FURTHER APPLICATIONS IN STATISTICS OF THE
$T_m(x)$ BESSEL FUNCTION.


By KARL PEARSON, S. A. STOUFFER AND F. N. DAVID*.


(1) The $T_m(x)$ function was defined in a paper by Pearson, Jeffery and Elderton† to be given by

$$T_m(x) = \frac{x^m K_m(x)}{\sqrt{\pi}\,2^m\,\Gamma(m+\tfrac12)} \qquad \text{(i)},$$

where $K_m(x)$ is the Bessel Function of the second kind and imaginary argument. Here $T_m(x) = T_m(-x)$, while $x$ on the right is always to be given its numerical value. Remembering this, we need not write $|x|^m K_m(|x|)$ in the equation.


If

$$y = M\,T_m(x) \qquad \text{(ii)}$$

be treated as a frequency curve, it will be symmetrical and run from $-\infty$ to $+\infty$ of $x$. The constant in (i) has been so chosen that

$$\int_{-\infty}^{\infty} y\,dx = 2M\int_0^{\infty} T_m(x)\,dx = M.$$
An integral form of $K_m(x)$ is given by‡

$$K_m(x) = \frac{\sqrt{\pi}\,x^m}{2^m\,\Gamma(m+\tfrac12)}\int_1^{\infty} e^{-xt}(t^2-1)^{m-\frac12}\,dt \qquad \text{(iii)}.$$

Hence we may write (ii) in the form

$$y = \frac{M\,x^{2m}}{2^{2m}\,\Gamma^2(m+\tfrac12)}\int_1^{\infty} e^{-xt}(t^2-1)^{m-\frac12}\,dt \qquad \text{(iv)}.$$
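As a numerical check on the choice of constant in (i), the following sketch evaluates $T_m(x)$ for the half-integer degrees $m = p + \tfrac12$ ($p$ a non-negative integer), for which $K_{p+\frac12}$ has a finite closed form. The function names, the midpoint rule and the truncation point are illustrative assumptions and form no part of the original memoir.

```python
import math


def bessel_k_half_integer(p, x):
    """K_{p+1/2}(x) for integer p >= 0 and x > 0, from its finite closed-form series."""
    total = 0.0
    for k in range(p + 1):
        total += (math.factorial(p + k)
                  / (math.factorial(k) * math.factorial(p - k))
                  / (2.0 * x) ** k)
    return math.sqrt(math.pi / (2.0 * x)) * math.exp(-x) * total


def t_curve(p, x):
    """T_m(x) of Equation (i) with m = p + 1/2; x is taken at its numerical value."""
    x = abs(x)
    m = p + 0.5
    norm = math.sqrt(math.pi) * 2.0 ** m * math.gamma(m + 0.5)
    return x ** m * bessel_k_half_integer(p, x) / norm


def half_area(p, upper=60.0, steps=60000):
    """Midpoint-rule estimate of the integral of T_{p+1/2} from 0 to `upper`.

    `upper` and `steps` are arbitrary choices; the tail beyond 60 is negligible
    for the small values of p used here.
    """
    h = upper / steps
    return h * sum(t_curve(p, (i + 0.5) * h) for i in range(steps))
```

For $p = 0$ the curve reduces to $\tfrac12 e^{-x}$, and in each case the half area should come out close to $\tfrac12$, in agreement with $2\int_0^\infty T_m(x)\,dx = 1$.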


(2) Consider in the next place the curve

$$y = y_0\,e^{-px/a}\left(1+\frac{x}{a}\right)^p \qquad \text{(v)},$$

the origin being the mode at distance $a$ from the start of the curve.

It follows easily that

$$y_0 = \frac{M}{a}\,\frac{p^{p+1}}{e^p\,\Gamma(p+1)} \qquad \text{(vi)},$$

where $M$ is the total frequency.


* The suggestion of the problem and the selection of the illustrative examples were provided by S. A. Stouffer, the solution through the $T_m(x)$ function was given by K. Pearson, who is also responsible for the text. Florence N. David computed the table of the probability integral of the $T_m(x)$ distribution.
† Biometrika, Vol. XXI. p. 184.
‡ G. N. Watson: A Treatise on the Theory of the Bessel Functions, p. 172, Equation (4).
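Thus the curve can be written

$$y = \frac{M}{a}\,\frac{p^{p+1}}{\Gamma(p+1)}\,e^{-p(1+x/a)}\left(1+\frac{x}{a}\right)^p \qquad \text{(vii)}.$$

Write $z = p\,(1+x/a)$ and the moments about the start of the curve can be found at once. These lead to*

$$\left.\begin{aligned}
\text{Mean} &= \bar{x}' = a(p+1)/p,\\
\text{Standard Deviation} &= \sigma = a\sqrt{p+1}/p,\\
\beta_1 &= \frac{4}{p+1},\qquad \beta_2 = 3+\frac{6}{p+1}
\end{aligned}\right\} \qquad \text{(viii)},$$

providing the well-known relation, $2\beta_2 - 3\beta_1 - 6 = 0$.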
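The constants in (viii) can be checked directly from the substitution $z = p(1+x/a)$, under which $z$ follows a Gamma law of index $p+1$ with raw moments $\Gamma(p+1+k)/\Gamma(p+1)$. The sketch below is only an illustration; the function name and the dictionary layout are assumptions.

```python
import math


def type3_constants(p, a):
    """Mean, standard deviation, beta_1 and beta_2 of curve (vii), measured from its start.

    The raw moments of z = p(1 + x/a) are Gamma(p + 1 + k) / Gamma(p + 1);
    distances from the start of the curve are a*z/p.
    """
    raw = [math.gamma(p + 1 + k) / math.gamma(p + 1) for k in range(5)]
    mean_z = raw[1]
    mu2 = raw[2] - mean_z ** 2
    mu3 = raw[3] - 3 * mean_z * raw[2] + 2 * mean_z ** 3
    mu4 = raw[4] - 4 * mean_z * raw[3] + 6 * mean_z ** 2 * raw[2] - 3 * mean_z ** 4
    scale = a / p
    return {
        "mean": scale * mean_z,
        "sd": scale * math.sqrt(mu2),
        "beta1": mu3 ** 2 / mu2 ** 3,
        "beta2": mu4 / mu2 ** 2,
    }
```

With the illustrative values $p = 3$, $a = 2$ this gives a mean of $8/3$, a standard deviation of $4/3$, $\beta_1 = 1$ and $\beta_2 = 4.5$, so that $2\beta_2 - 3\beta_1 - 6$ vanishes as stated.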


(3) Now suppose there are two independent variates $u$ and $v$ both of which have frequency distributions provided by Equation (vii). We assume the two distributions to have the same $p$, but to have different standard deviations $\sigma_u$ and $\sigma_v$, or, what amounts to the same thing, different modal distances $a$ and $b$. We will measure our variates $u$ and $v$ from the start of their curves, which then take the form

$$y_u = M\,p^{p+1}\,e^{-pu/a}\left(\frac{u}{a}\right)^p \Big/ \{a\,\Gamma(p+1)\}$$

and

$$y_v = M\,p^{p+1}\,e^{-pv/b}\left(\frac{v}{b}\right)^p \Big/ \{b\,\Gamma(p+1)\}.$$

If we take $w = y_u \times y_v / M$ we obtain the combined frequency surface

$$w = M\,\frac{p^{2p+2}}{ab\,\Gamma^2(p+1)}\,e^{-p\left(\frac{u}{a}+\frac{v}{b}\right)}\left(\frac{uv}{ab}\right)^p \qquad \text{(ix)}.$$

Now put $X = p\left(\dfrac{u}{a}+\dfrac{v}{b}\right)$ and $Y = p\left(\dfrac{v}{b}-\dfrac{u}{a}\right)$; then the element for integration of the above surface is $du\,dv$, or if we take it $d\!\left(\dfrac{pu}{a}\right)d\!\left(\dfrac{pv}{b}\right)$ we may replace it by $\tfrac12\,dX\,dY$, and we have for integration

$$w = \frac{M}{\Gamma^2(p+1)\,2^{2p+1}}\,e^{-X}\,(X^2-Y^2)^p\,[dX\,dY] \qquad \text{(x)}.$$


We have to integrate this out for $X$ to get the distribution curve of $Y$. In the upper octant $XOB$ (Fig. 1, p. 295) the limit for $X$ is clearly $X = Y$ to $X = \infty$ along the shaded area. Or, the curve of distribution of $Y$ is

$$z = \frac{M}{\Gamma^2(p+1)\,2^{2p+1}}\int_Y^{\infty} e^{-X}(X^2-Y^2)^p\,dX \qquad \text{(x bis)}.$$

Put $X = Yt$ and we have

$$z = \frac{M\,Y^{2p+1}}{\Gamma^2(p+1)\,2^{2p+1}}\int_1^{\infty} e^{-Yt}(t^2-1)^p\,dt \qquad \text{(xi)}.$$


* Phil. Trans., Vol. 1854, p. 373. 
If we take the lower octant $XOA$, the limits of $X$ are $-Y$ to $\infty$, but as $Y$ is now negative we get precisely the same result, or we say that the whole curve of distribution of $Y$ is (xi), $Y$ being taken as positive, and from $0$ to $\infty$, and mirrored in the axis of $X$. This result also flows from the fact that the distribution of $\dfrac{v}{b}-\dfrac{u}{a}$ must be a symmetrical curve, as the frequency curves for $u/a$ and $v/b$ are identical.

Now if in (iv) we write $x = Y$, $m = p+\tfrac12$, we see that the $z$ of (xi) is given by

$$z = M\,T_{p+\frac12}(Y) \qquad \text{(xii)},$$

which leads to $\tfrac12 M$ for the area of our half curve. In other words our curve for $Y$ is the $T_{p+\frac12}$ curve mirrored on itself. The ordinates of this curve have been computed by Dr E. M. Elderton*.








[Fig. 1. Diagram with origin $O$ and horizontal axis $pu/a$, showing the axes of $X$ and $Y$ and the points $A$, $B$ and $C$.]


(4) Now the odd moments of the mirrored curve vanish. Let us find the even moment-coefficients. We have from (x)

$$M\mu_{2s} = 2\,\frac{M}{\Gamma^2(p+1)\,2^{2p+1}}\iint Y^{2s}\,e^{-X}(X^2-Y^2)^p\,dY\,dX,$$

where the limits of $X$ and $Y$ are to be chosen so as to cover the upper octant $BOX$. Now if we integrate first with regard to $Y$, the limits will be from $0$ to $X$, and then with regard to $X$ from $0$ to $\infty$. Thus

$$\mu_{2s} = \frac{1}{\Gamma^2(p+1)\,2^{2p}}\int_0^{\infty} e^{-X}\int_0^{X} Y^{2s}(X^2-Y^2)^p\,dY\,dX \qquad \text{(xiii)}.$$

Put $Y = X\lambda$ and we have

$$\mu_{2s} = \frac{1}{2^{2p}\,\Gamma^2(p+1)}\int_0^{\infty} e^{-X}X^{2s+2p+1}\,dX\int_0^1 \lambda^{2s}(1-\lambda^2)^p\,d\lambda,$$


* See Biometrika, Vol. XXI. pp. 194-201, or Tables for Statisticians and Biometricians, Part II. pp. lxxix-lxxxviii and 138-144.
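or, if $\lambda^2 = \kappa$,

$$\mu_{2s} = \frac{\Gamma(2s+2p+2)}{2^{2p+1}\,\Gamma^2(p+1)}\int_0^1 \kappa^{s-\frac12}(1-\kappa)^p\,d\kappa = \frac{\Gamma(2s+2p+2)}{2^{2p+1}\,\Gamma^2(p+1)}\,\frac{\Gamma(s+\tfrac12)\,\Gamma(p+1)}{\Gamma(s+p+\tfrac32)}.$$

If $s = 0$,

$$\mu_0 = 1 = \frac{\Gamma(2p+2)\,\Gamma(\tfrac12)}{2^{2p+1}\,\Gamma(p+1)\,\Gamma(p+\tfrac32)}.$$

Hence

$$\mu_{2s} = \frac{\Gamma(2s+2p+2)}{\Gamma(2p+2)}\,\frac{\Gamma(p+\tfrac32)}{\Gamma(s+p+\tfrac32)}\,\frac{\Gamma(s+\tfrac12)}{\Gamma(\tfrac12)} \qquad \text{(xiv)},$$

and

$$\mu_2 = \frac{(2p+3)(2p+2)}{p+\tfrac32}\cdot\tfrac12 = 2(p+1) \qquad \text{(xv)}.$$

Generally

$$\mu_{2s} = (2s-1)(2p+2s)\,\mu_{2s-2} \qquad \text{(xv bis)},$$

$$\beta_{2s-2} = \frac{\mu_{2s}}{(\mu_2)^s} = \frac{(2s-1)(2p+2s)}{2p+2}\,\frac{\mu_{2s-2}}{(\mu_2)^{s-1}},$$

or,

$$\beta_{2s-2} = (2s-1)\left(1+\frac{2(s-1)}{2(p+1)}\right)\beta_{2s-4} \qquad \text{(xvi)}.$$

Thus finally

$$\beta_{2s-2} = (2s-1)(2s-3)\ldots 1\left(1+\frac{s-1}{p+1}\right)\left(1+\frac{s-2}{p+1}\right)\ldots\left(1+\frac{1}{p+1}\right) \qquad \text{(xvii)}.$$

It will be clear that when $p \to \infty$ we obtain

$$\beta_{2s-2} = (2s-1)(2s-3)\ldots 1,$$

the familiar $\beta_{2s-2}$ formula for the normal curve, into which the $T_{p+\frac12}$ function then passes.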
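The passage from the Gamma-function form (xiv) to the product form (xvii) can be confirmed numerically. The sketch below evaluates $\beta_{2s-2} = \mu_{2s}/\mu_2^{\,s}$ in both ways; the function names are illustrative assumptions.

```python
import math


def mu_even(p, s):
    """mu_{2s} of the mirrored T_{p+1/2} curve, from Equation (xiv)."""
    return (math.gamma(2 * s + 2 * p + 2) / math.gamma(2 * p + 2)
            * math.gamma(p + 1.5) / math.gamma(s + p + 1.5)
            * math.gamma(s + 0.5) / math.gamma(0.5))


def beta_from_gammas(p, s):
    """beta_{2s-2} = mu_{2s} / mu_2^s using Equation (xiv)."""
    return mu_even(p, s) / mu_even(p, 1) ** s


def beta_from_product(p, s):
    """beta_{2s-2} from the product form of Equation (xvii)."""
    value = 1.0
    for k in range(1, s + 1):
        # The k-th pair of factors is (2k - 1) and (1 + (k - 1)/(p + 1)).
        value *= (2 * k - 1) * (1.0 + (k - 1) / (p + 1.0))
    return value
```

For any admissible $p$ the two functions should agree, `mu_even(p, 1)` should equal $2(p+1)$ as in (xv), and for very large $p$ the product form tends to $(2s-1)(2s-3)\ldots1$, the normal value.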


Consider the Type VII curve

$$y = y_0\,\frac{1}{\left(1+\dfrac{x^2}{a^2}\right)^{\frac12 n}}.$$

Here we have

$$\beta_{2s-2} = (2s-1)(2s-3)\ldots 1\left(1+\frac{2(s-1)}{n-2s-1}\right)\left(1+\frac{2(s-2)}{n-2s+1}\right)\ldots\left(1+\frac{2}{n-5}\right)$$

and $\mu_2 = a^2/(n-3)$.

Now it is clear that we can make $\mu_2$ and $\mu_4$ agree in the Type VII and the $T_{p+\frac12}$ curves*, but farther than that we cannot go, although the $\beta$'s may not differ widely if $n$ be considerable. The $T_{p+\frac12}$ curve has the further advantage that no moment-coefficients tend to become infinite, while if $n$ be an odd integer, those for the Type VII curve may become so. For values of $p$ not too great the Type VII will fit the distribution of $Y$ considerably better than the normal curve. For considerable values of $p$, both Type VII and the $T_{p+\frac12}$ curves pass into the normal curve.
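The extent of the agreement can be illustrated numerically. The sketch below assumes the Type VII curve in the form $y = y_0(1+x^2/a^2)^{-\frac12 n}$, for which $\mu_2 = a^2/(n-3)$, and takes its even moment ratios from the standard moments of that curve; it then sets $n = 2p+7$ so that $\beta_2$ agrees with that of the $T_{p+\frac12}$ curve. The function names are illustrative only.

```python
def beta_t_curve(p, s):
    """beta_{2s-2} of the T_{p+1/2} curve, product form (xvii)."""
    value = 1.0
    for k in range(1, s + 1):
        value *= (2 * k - 1) * (1.0 + (k - 1) / (p + 1.0))
    return value


def beta_type7(n, s):
    """beta_{2s-2} of the Type VII curve y = y0 (1 + x^2/a^2)^(-n/2).

    Assumed form: each factor is (2k - 1)(n - 3)/(n - 1 - 2k), which is the
    same as (2k - 1)(1 + 2(k - 1)/(n - 2k - 1)). The moment of order 2s is
    finite only when n > 2s + 1.
    """
    if n - 1 - 2 * s <= 0:
        raise ValueError("moment of order 2s is infinite for this n")
    value = 1.0
    for k in range(1, s + 1):
        value *= (2 * k - 1) * (n - 3.0) / (n - 1.0 - 2 * k)
    return value
```

Taking the illustrative value $p = 5$, so that $n = 17$, both curves give $\beta_2 = 3.5$, while $\beta_4$ is $23.33$ for the $T$ curve against $24.5$ for the Type VII curve: the second and fourth moments can be matched, the sixth cannot.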


* We must take $\dfrac{6}{n-5} = \dfrac{3}{p+1}$, or $n = 2p+7$, and $a = 2\sqrt{(p+1)(p+2)}$.

(5) A few further points may be noted. If $p = -\tfrac12$ the $T_0$-curve asymptotes to the vertical at the origin, and this holds as long as $p$ lies between $-\tfrac12$ and $0$; if $p = 0$, the $T_{\frac12}$-curve starts with a finite ordinate and makes a finite angle with the vertical, it is the exponential curve. If $p$ be positive we see from (x bis) that $dz/dY = 0$ for $Y = 0$, or the double mirror curves have a common tangent at the axis of symmetry and will in appearance form a single curve. If $p$ be a positive integer it is possible to expand $z$ in powers of $Y$, but the series does not present any great advantages to the computer.


When $p = 11$, Dr Elderton's Tables terminate, but it is shown in the memoir by Pearson, Jeffery and Elderton* that when $p = 11$, the two curves

$$z = M\,T_{p+\frac12}(Y)$$

and

$$z = M\,\frac{\Gamma\{\tfrac12(2p+7)\}}{2\sqrt{\pi\,(p+1)(p+2)}\;\Gamma(p+3)}\;\frac{1}{\left(1+\dfrac{Y^2}{4(p+1)(p+2)}\right)^{\frac12(2p+7)}} \qquad \text{(xviii)}$$


coincide for practical statistical purposes. The areas of this latter curve up to given 
values of Y have been tabled+ from p=—4 to p=12, but this hardly carries us 
beyond the $T_m$-tables. The completed (and now at press) Tables of the Incomplete
B-function carry us up to $2p+7 = 101$, or $p = 47$.


(6) Now let us turn to the means of samples of size $n$ drawn from the Type III curve

$$y = y_0\,e^{-px/a}\left(\frac{x}{a}\right)^p \qquad \text{(xix)},$$

where the origin is at the start of the curve and $a$ is the distance to the mode from the start. Let us suppose a sample $x_1, x_2, x_3 \ldots x_n$ drawn and let its mean be $\bar{x}_n = (x_1+x_2+\ldots+x_n)/n$. Then the chance $P$ of a sample lying between $x_1$ and $x_1+\delta x_1$, $x_2$ and $x_2+\delta x_2$, ... $x_n$ and $x_n+\delta x_n$ is given by

$$P = \text{const.}\times e^{-\frac{p}{a}(x_1+x_2+\ldots+x_n)}\left(\frac{x_1 x_2\ldots x_n}{a^n}\right)^p dx_1\,dx_2\ldots dx_n.$$

Now get rid of $x_1$ by introducing $\bar{x}_n$ as a variable and write $l_2$ for

$$n\bar{x}_n - x_3 - x_4 - \ldots - x_n.$$

We have

$$P = \text{const.}\times e^{-\frac{np\bar{x}_n}{a}}\,d\bar{x}_n\left(\frac{l_2-x_2}{a}\right)^p\left(\frac{x_2}{a}\right)^p\left(\frac{x_3\ldots x_n}{a^{n-2}}\right)^p dx_2\,dx_3\ldots dx_n.$$
a a) > 


Put $x_2 = l_2 x_2'$ and integrate out for $x_2 = 0$ to $l_2$ or $x_2' = 0$ to $1$. This will introduce
a B-function into the constant, but leave us with


* Cf. Biometrika, Vol. XXI. pp. 171 and 173 for accordance of the curves. Their equations are given on p. 185, where we must write $\tfrac12 n - 1 = p+\tfrac12$, or $n = 2p+3$. The two curves have then the same first four moment-coefficients. If $\eta = Y/\{2(p+1)(p+2)\}$, then the proportional area from $\eta = 0$ up to any arbitrary value of $\eta$ is given by $\tfrac12 I_x(\tfrac12, p+1)$, where $I_x(\tfrac12, p+1) = B_x(\tfrac12, p+1)/B(\tfrac12, p+1)$, $B_x$ and $B$ being the incomplete and complete Beta-functions.

† See Biometrika, Vol. XXII. pp. 253-283, or Tables for Statisticians and Biometricians, Part I, pp. cxxv-cxlii and pp. 169-177.
$$P = \text{const.}\times e^{-\frac{np\bar{x}_n}{a}}\,d\bar{x}_n\left(\frac{l_2}{a}\right)^{2p+1}\left(\frac{x_3\ldots x_n}{a^{n-2}}\right)^p dx_3\ldots dx_n.$$

Write $l_2 = l_3 - x_3$, and proceeding in the same way, we find

$$P = \text{const.}\times e^{-\frac{np\bar{x}_n}{a}}\,d\bar{x}_n\left(\frac{l_3}{a}\right)^{3p+2}\left(\frac{x_4\ldots x_n}{a^{n-3}}\right)^p dx_4\ldots dx_n,$$

where $l_3 = n\bar{x}_n - x_4 - x_5 - \ldots - x_n$.

Continuing to repeat this process we ultimately get rid of all the variables but $\bar{x}_n$ and find*

$$P = \text{const.}\times e^{-\frac{np\bar{x}_n}{a}}\left(\frac{\bar{x}_n}{a}\right)^{n(p+1)-1} d\bar{x}_n \qquad \text{(xx)}.$$


We now put this into the canonical form for a Type III frequency curve, i.e.

$$y = y_0\,e^{-P\bar{x}_n/A}\left(\frac{\bar{x}_n}{A}\right)^{P} \qquad \text{(xx bis)}.$$

Hence we must have $P = n(p+1)-1$, and $P/A = np/a$, or $A = a\,\dfrac{n(p+1)-1}{np}$.

Accordingly:

$$\left.\begin{aligned}
\text{Mode of } \bar{x}_n &= A = a\,\frac{n(p+1)-1}{np},\\
\text{Mean of } \bar{x}_n &= A\,\frac{P+1}{P} = a\,\frac{p+1}{p} = \bar{x},\\
\sigma^2_{\bar{x}_n} &= A^2\,\frac{P+1}{P^2} = \frac{1}{n}\,\frac{a^2(p+1)}{p^2} = \frac{1}{n}\,\sigma_x^2
\end{aligned}\right\} \qquad \text{(xxi)},$$

where $\bar{x}$ and $\sigma_x$ are the mean and standard deviation of the population from which the sample of $n$ is drawn. Lastly

$$B_1 = \frac{4}{n(p+1)} \quad\text{and}\quad B_2 = 3+\tfrac32 B_1 \qquad \text{(xxii)}.$$

Clearly, if $n$ and $p$ are not very small, then (xx bis) will approach much nearer to a normal distribution than the parent population (xix).
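The constants of the curve of means follow mechanically from $P = n(p+1)-1$ and $P/A = np/a$. The sketch below collects them for given $p$, $a$ and $n$; the function name and the returned dictionary are illustrative assumptions.

```python
def mean_curve_constants(p, a, n):
    """Constants of the Type III curve (xx bis) for means of samples of size n.

    p and a are the index and modal distance of the parent curve (xix).
    """
    big_p = n * (p + 1) - 1          # P = n(p + 1) - 1
    big_a = a * big_p / (n * p)      # from P / A = n p / a
    return {
        "P": big_p,
        "A": big_a,
        "mode": big_a,
        "mean": big_a * (big_p + 1) / big_p,
        "variance": big_a ** 2 * (big_p + 1) / big_p ** 2,
        "B1": 4.0 / (big_p + 1),
        "B2": 3.0 + 6.0 / (big_p + 1),
    }
```

For the illustrative values $p = 2$, $a = 3$, $n = 5$ the mean is $4.5 = a(p+1)/p$ and the variance is $1.35$, one fifth of the parent variance $a^2(p+1)/p^2 = 6.75$, while $B_2 = 3 + \tfrac32 B_1$ as in (xxii).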


(7) We can now apply our results to particular cases. If we draw two individuals $u$ and $v$ out of Type III curves like (xix), with the same skewness as measured by $p$, then if $a$ and $a'$ be their modal distances, and

$$Y = p\left(\frac{v}{a'}-\frac{u}{a}\right) = (p+1)\left(\frac{v}{\bar{x}'}-\frac{u}{\bar{x}}\right) = \sqrt{p+1}\left(\frac{v}{\sigma_{x'}}-\frac{u}{\sigma_x}\right),$$

for these are all equivalent, then the distribution of $Y$ is given by

$$z = M\,T_{p+\frac12}(Y).$$

If the two individuals are taken from absolutely the same population, i.e. $a = a'$, then

$$Y = p\,\frac{v-u}{a} = (p+1)\,\frac{v-u}{\bar{x}} = \sqrt{p+1}\;\frac{v-u}{\sigma_x}.$$


* This result was published by Church: see Biometrika, Vol. xvi. p. 336. 
Such results, however interesting in the case of experimental sampling in the
Laboratory, where we have a knowledge of the parent population, will hardly be of
practical service, because we should usually lack a knowledge of $p$, $\bar{x}$ and $\sigma_x$.


Now turn to (xx), and suppose we have taken two samples of $n$ and that their means are $\bar{x}_n$ and $\bar{x}_n'$, then the distribution of $Y = \dfrac{P}{A}(\bar{x}_n' - \bar{x}_n)$ will be

$$z = \tfrac12 M\,T_{P+\frac12}(Y) = \tfrac12 M\,T_{n(p+1)-\frac12}(Y) \qquad \text{(xxiii)}.$$

There are now a variety of ways in which it is possible to express $Y$. In the first place $P/A = np/a$, where $p$ and $a$ refer to the parent population, but mean $-$ mode $= a/p = \bar{x} - \tilde{x}$, say. Again $a/p = \tfrac12\sigma_x\sqrt{\beta_1}$. Thus we have

$$Y = \frac{n(\bar{x}_n' - \bar{x}_n)}{\bar{x}-\tilde{x}} = \frac{2n}{\sqrt{\beta_1}}\,\frac{\bar{x}_n' - \bar{x}_n}{\sigma_x} \qquad \text{(xxiv)}.$$


Further, we need the value of the $p+1$ in the degree of the $T_m$ function; we have

$$p+1 = \frac{\bar{x}}{\bar{x}-\tilde{x}} = \frac{\sigma_x^2}{(\bar{x}-\tilde{x})^2} = \frac{4}{\beta_1} \qquad \text{(xxv)}.$$

Here $\bar{x}$, $\tilde{x}$, $\sigma_x$ and $\beta_1$ all refer like $p$ to the parent population. Clearly some two of these quantities $\bar{x}$ and $\tilde{x}$, $\bar{x}$ and $\sigma_x$, or $\beta_1$ and $\sigma_x$ must be known, or we cannot determine $a$ and $p$. We shall see later that in certain other applications $p$ is known, and then probably $\sigma_x$ is the best quantity to seek for. It might be thought that $\bar{x}$ would be easy to find. It may be so, if the start of the curve can be determined, but it must be remembered that $\bar{x}$ is the mean measured from a definite point of the parent population, i.e. the start of the parent population, and this may be quite unknown. $\bar{x} - \tilde{x}$ does not involve this knowledge, but the mode is not an easily determined character. On the whole $\beta_1$ and $\sigma_x$ can probably be most easily obtained from the samples. Of course this refers to cases in which the parent population is unknown, but suspected of having a skewness which may be approximated to by a Type III curve. The procedure here would be to determine the second and third moment coefficients of the pooled samples, and thus obtain the best approximation which is available to $\beta_1$ and $\sigma_x$ of the supposed parent population.


We then take $m = \dfrac{4n}{\beta_1} - \dfrac12$, and

$$Y = \frac{2n}{\sqrt{\beta_1}}\,\frac{\bar{x}_n' - \bar{x}_n}{\sigma_x} \qquad \text{(xxvi)},$$

and test whether the probability integral of $T_m(Y)$ has a value sufficiently large to justify us in assuming that $\bar{x}_n'$ and $\bar{x}_n$ came from the same population.
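The two quantities needed to enter a table of the probability integral, namely the degree $m$ and the argument $Y$ of (xxvi), may be computed as in the sketch below. The function name, its argument names and the numerical values used afterwards are hypothetical, chosen only to illustrate the arithmetic.

```python
import math


def mean_difference_inputs(n, beta1, sigma_x, mean_first, mean_second):
    """Degree m and argument Y of T_m(Y) for two sample means, Equation (xxvi).

    beta1 and sigma_x are the pooled-sample estimates of the parent constants;
    beta1 must be positive, since a Type III parent is assumed.
    """
    if beta1 <= 0:
        raise ValueError("beta1 must be positive for a Type III parent")
    m = 4.0 * n / beta1 - 0.5
    y = 2.0 * n / math.sqrt(beta1) * (mean_second - mean_first) / sigma_x
    return m, y
```

With hypothetical values $n = 10$, $\beta_1 = 0.64$, $\sigma_x = 2$ and sample means $5.0$ and $5.4$, this gives $m = 62$ and $Y = 5$; the same $m$ follows from $n(p+1)-\tfrac12$ with $p+1 = 4/\beta_1 = 6.25$.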

Perhaps a more useful case occurs when one sample is sufficiently large to give
reasonable values for the constants, and we ask whether the other could have been
drawn from the same population. In this case we may determine $p$ and $a$ with
sufficient accuracy from the large sample and measure the probability of $\bar{x}_n$ for the
second sample from (xx) or (xx bis) by aid of the Tables of the Incomplete $\Gamma$-function.
Generally we may state our problem to be this: We wish to know from the
means of two samples whether they are consistent with these samples having been
both drawn from the same population. We have no reason for supposing that
population follows a normal distribution, or we may have good reason for supposing
its distribution skew. Shall we do better to assume $\beta_1 = 0$ and the unknown parent
population to be normal, or to work with the value of $\beta_1$ found from the pooled
samples? Probably with samples of 25 or 20 the latter would be the wiser course;
at any rate, on comparison with the former method, it would give us some measure 
of the extent to which skewness might invalidate our conclusions from the normal 
hypothesis. Unfortunately we do not at present know the distribution of the 
variance of samples drawn from a Type III curve, or indeed from any skew curve. 
Had we known it, it might be possible to construct a quantity like “Student’s z” 
with the additional advantage, however, that it would possess correlation between 
numerator and denominator. 


(8) The Type III curve gives the distribution of frequency for other statistical functions than that of the means of samples drawn from a Type III distribution. One of the most important cases is that of the distribution of the standard deviations (or of the variances) of samples from a normal parent population. If $\Sigma$ be the standard deviation, $M_2$ the variance of the parent population $= \Sigma^2$, $n$ the size of the sample, $\sigma$ its standard deviation, $\mu_2$ its variance, we have Helmert's Equation for the distribution of $\sigma$,

$$y = y_0\left(\frac{\sigma}{\Sigma}\right)^{n-2} e^{-\frac{n\sigma^2}{2\Sigma^2}} \qquad \text{(xxvii)},$$

or, expressed in terms of the variances, the Type III equation

$$y = y_0\left(\frac{\mu_2}{M_2}\right)^{\frac{n-3}{2}} e^{-\frac{n\mu_2}{2M_2}}\,[d\mu_2] \qquad \text{(xxviii)}.$$


Hence if we have two samples of size $n$ with variances $\mu_2$, $\mu_2'$ taken from normal parent populations of variance $M_2$ and $M_2'$, the distribution of the difference

$$Y = \frac{n}{2}\left(\frac{\mu_2'}{M_2'} - \frac{\mu_2}{M_2}\right)$$

is given by

$$z = \tfrac12 M\,T_{\frac12(n-2)}(Y) \qquad \text{(xxix)}.$$

We are therefore in a position to determine whether the variances of two samples each measured in terms of the variance of its parent population are significantly different.


If the two parent populations are identical, then $Y = \dfrac{n}{2}(\mu_2' - \mu_2)/M_2$. If, as may often be the case, the parent population be unknown, then the only remedy is to take for $M_2$ the value provided by the two samples pooled. If we know the means $\bar{x}_n$ and $\bar{x}_n'$ of the two samples to be the same, this will be $\tfrac12(\mu_2' + \mu_2)$, so that the frequency of the difference will be given by

$$z = \tfrac12 M\,T_{\frac12(n-2)}\!\left(n\,\frac{\mu_2' - \mu_2}{\mu_2' + \mu_2}\right) \qquad \text{(xxx)}.$$
He + Pe 
If on the other hand we know that $\bar{x}_n$ is not equal to $\bar{x}_n'$, we have to put

$$M_2 = \tfrac12(\mu_2' + \mu_2) + \tfrac14(\bar{x}_n' - \bar{x}_n)^2,$$

and have accordingly

$$z = \tfrac12 M\,T_{\frac12(n-2)}\!\left(\frac{n(\mu_2' - \mu_2)}{\mu_2' + \mu_2 + \tfrac12(\bar{x}_n' - \bar{x}_n)^2}\right) \qquad \text{(xxxi)}.$$


Dr Elderton’s Tables provide the ordinates* of the above curves up to samples 
of n=25. For large samples at present we are thrown back on the Type VII curve, 
the probability integral of which is included in the Incomplete B-function Tables. 


(9) If we have a population of large size $M$ consisting of $v$ categories whose frequencies are

$$m_1, m_2, \ldots m_v,$$

and a sample of size $N$ be taken giving categories of size

$$n_1, n_2, \ldots n_v$$

only restricted by the condition $\sum_{s=1}^{v}(n_s) = N$, and we form the quantity

$$\chi^2 = \frac{(n_1-\tilde{n}_1)^2}{\tilde{n}_1} + \frac{(n_2-\tilde{n}_2)^2}{\tilde{n}_2} + \ldots + \frac{(n_v-\tilde{n}_v)^2}{\tilde{n}_v} \qquad \text{(xxxii)},$$

where $\tilde{n}_s = m_s N/M$, then the distribution of $\chi$ follows the curve†

$$y = y_0\,e^{-\frac12\chi^2}\chi^{v-2}\,[d\chi] \qquad \text{(xxxiii)}$$

and the distribution of $\tfrac12\chi^2$, the Type III curve

$$y = y_0\,e^{-\frac12\chi^2}\left(\tfrac12\chi^2\right)^{\frac{v-3}{2}}[d(\tfrac12\chi^2)] \qquad \text{(xxxiv)}.$$

Now this Type III curve, like that for the variance, has its $p$ known, $= \tfrac12(v-3)$, which can be found at once from $v$ the number of categories in the sample. Further, its $a$, i.e. its modal distance, $= \tfrac12(v-3)$, and its standard deviation $= \sqrt{\tfrac12(v-1)}$. Again, if required, $\beta_1 = \dfrac{8}{v-1}$, and $\beta_2 = 3+\dfrac{12}{v-1}$.

Thus if we have two $\chi^2$'s, namely $\chi^2$ and $\chi'^2$, from two samples, the distribution of their difference will be given by

$$z = \tfrac12 M\,T_{\frac12(v-2)}\!\left(\tfrac12\chi'^2 - \tfrac12\chi^2\right) \qquad \text{(xxxv)}.$$


Accordingly we have obtained a measure of whether two $\chi^2$'s supposed to be due to sampling from the same population are reasonably probable. Given the $\chi^2$'s the solution is independent of any knowledge of the parent population, beyond the number of categories used in determining the $\chi^2$'s.

We may make some remarks on the curve in (xxxv). The standard deviation of a $T_m$ curve is $\sqrt{2m+1}$ and in our case $m = \tfrac12(v-2)$; therefore, if $Y = \tfrac12\chi'^2 - \tfrac12\chi^2$,

$$\sigma_Y = \sqrt{v-1} = \sqrt{\tfrac12(v-1)+\tfrac12(v-1)} = \sqrt{\sigma^2_{\frac12\chi'^2}+\sigma^2_{\frac12\chi^2}},$$

as it should do, since $\tfrac12\chi'^2$ and $\tfrac12\chi^2$ are by hypothesis due to independent samples.
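When the number of categories $v$ is odd, $m = \tfrac12(v-2)$ is of the form $p+\tfrac12$ with $p$ a whole number, and the tail area of the $T_m$ curve beyond an observed $|Y|$ can be found by direct quadrature. The sketch below is a rough illustration only: it treats $T_m$ as a curve of unit total area over the whole range, as in (ii) with $M = 1$, and the function names, the midpoint rule and the integration span are assumptions, not the method by which the accompanying table was computed.

```python
import math


def t_half_integer(p, x):
    """T_{p+1/2}(x) for integer p >= 0 and x > 0, using the closed form of K_{p+1/2}."""
    series = sum(
        math.factorial(p + k)
        / (math.factorial(k) * math.factorial(p - k) * (2.0 * x) ** k)
        for k in range(p + 1)
    )
    k_value = math.sqrt(math.pi / (2.0 * x)) * math.exp(-x) * series
    m = p + 0.5
    return x ** m * k_value / (math.sqrt(math.pi) * 2.0 ** m * math.gamma(m + 0.5))


def chi_square_difference_tail(chi2_first, chi2_second, cells, span=80.0, steps=80000):
    """Chance of a difference at least as large as |Y| = |chi2_second - chi2_first| / 2.

    Only an odd number of cells (>= 3) is handled, so that p = (cells - 3)/2 is an
    integer. `span` and `steps` are arbitrary quadrature settings suited to small p.
    """
    if cells < 3 or cells % 2 == 0:
        raise ValueError("this sketch handles an odd number of cells only")
    p = (cells - 3) // 2
    y = abs(0.5 * chi2_second - 0.5 * chi2_first)
    h = span / steps
    area = h * sum(t_half_integer(p, y + (i + 0.5) * h) for i in range(steps))
    return 2.0 * area  # both tails of the mirrored curve
```

For $v = 3$ the curve is $\tfrac12 e^{-|Y|}$, so two values $\chi^2 = 2$ and $\chi'^2 = 6$ give a tail area of $e^{-2}$, or about $0.135$.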


* A probability integral table of $T_m(Y)$ accompanies this paper.
† Pearson, Phil. Mag. 1899, p. 239. For properties of the $\chi^2$ curve, see Drapers' Company Research
Memoirs, Biometric Series, vit. 
Again, $\beta_1 = 0$, of course, since the $T_m$ curve is symmetrical, and $\beta_2 = 3+\dfrac{6}{v-1}$*. Accordingly, $\beta_2 - 3 = \tfrac12(\beta_2 - 3)$ of the $\tfrac12\chi^2$ distribution or the kurtosis is halved, i.e. the $\tfrac12\chi'^2 - \tfrac12\chi^2$ curve is 50% less leptokurtic than the parent population.

Further, the distribution of the $\chi^2$ difference does not depend directly† on the

* See above, p. 294, Equation (viii).
† Indirectly, of course, it does, a point too often overlooked. Take, for example, the data for Typhoid Inoculated and Attacked, due to Greenwood and Yule.











| | Attacked | Not Attacked | Totals |
|---|---|---|---|
| Inoculated | 56 | 6,759 | 6,815 |
| Non-Inoculated | 272 | 11,396 | 11,668 |
| Totals | 328 | 18,155 | 18,483 |




















What is the exact meaning of the 6,815 and 11,668? Are they simply samples of the two classes Inoculated and Non-Inoculated, that the recorders have taken, or have they some relation to the numbers in the community willing or unwilling to be inoculated? If the former, they are subject to the more or less arbitrary choice of the recorders, and if we find $\chi^2$ we are finding it subject to the supposition that the recorders repeatedly made experiments with the same numbers. In this case there is only one degree of freedom in this table, or $\kappa - 1$ degrees, when two populations with $\kappa$ categories are compared. It seems in many respects more advantageous to treat this problem in the manner it was first investigated, namely as the comparison of two linear series, when the limitation on the degrees of freedom is seen at once to arise and arise naturally (Biometrika, Vol. VIII. pp. 250-254).

But suppose the numbers of Inoculated and Non-Inoculated arise from some natural division, as in the case of vaccination, and non-vaccination in the country at large, then our table represents an arbitrary sample of the total population, and there are three degrees of freedom, and this must be borne in mind in determining the probability of the observed result. In such a case we may or may not know the relative frequencies of the inoculated and non-inoculated in the population under consideration. If we do not, the only thing we can do is to use the observed ratio, as if it were the population ratio of the two classes. In the case of the ratio of inoculated to non-inoculated following as a natural order, i.e. a table obtained by a random sample out of a general population, where we select an individual without regard to whether he has been inoculated or attacked and afterwards inquire into details, the $\chi^2$ is simply proportional to the total number of individuals selected. Thus the $\chi^2$ for the above table is 56.234, but had we taken a sample of half the size it would be (subject to variation of sampling) 28.117. In other words the value of $\chi^2$ and accordingly of $P$ depends very largely on the size of the sample, and the comparison of $\chi^2$'s for two tables of different totals can be made to give almost any value we please to the probability of the two $\chi^2$'s being due to samples of different sizes from the same parent population. The quantity which would remain approximately the same would be the $\phi^2$, and in comparing two tables like the above of different sizes to test whether they come from the same population, it is rather the comparison of $\phi^2$ and $\phi'^2$, than of $\chi^2$ and $\chi'^2$ which should guide us.

If, on the other hand, the sizes of the two samples of Inoculated and of Non-Inoculated in a table like that above have been arbitrarily selected, the $\chi^2$ will change widely with those sizes. For example, if we suppose the table formed by two independent samplings, one of the Inoculated and another of the Non-Inoculated, and then recording whether they had been attacked or not, the vertical marginal tables are at our choice, and for five arbitrary sizes of the two samples we have approximately:


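| Table | | Attacked | Not Attacked | Totals |
|---|---|---|---|---|
| (a) | Inoculated | 56 | 6,759 | 6,815 |
| | Non-Inoculated | 272 | 11,396 | 11,668 |
| | Totals | 328 | 18,155 | 18,483 |
| (b) | Inoculated | 56 | 6,759 | 6,815 |
| | Non-Inoculated | 159 | 6,656 | 6,815 |
| | Totals | 215 | 13,415 | 13,630 |
| (c) | Inoculated | 56 | 6,759 | 6,815 |
| | Non-Inoculated | 68 | 2,849 | 2,917 |
| | Totals | 124 | 9,608 | 9,732 |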
size of the samples on which $\chi^2$ and $\chi'^2$ are based, but solely on the number of cells used in computing the $\chi^2$'s being the same.

A point here is, perhaps, worth noting as we have not seen it recorded. If we take two sets of $N$ samples each, and $m_{\bar{x}_n}$, $m'_{\bar{x}_n}$ be the means of the means of the two sets, then $m_{\bar{x}_n}$ and $m'_{\bar{x}_n}$ will be distributed according to a Type III curve, if the original parent population is, because in this case $\bar{x}_n$ and $\bar{x}_n'$ are so distributed,





























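| Table | | Attacked | Not Attacked | Totals |
|---|---|---|---|---|
| (d) | Inoculated | 28 | 3,379 | 3,407 |
| | Non-Inoculated | 136 | 5,698 | 5,834 |
| | Totals | 164 | 9,077 | 9,241 |
| (e) | Inoculated | 28 | 3,379 | 3,407 |
| | Non-Inoculated | 68 | 2,849 | 2,917 |
| | Totals | 96 | 6,228 | 6,324 |

These tables give:

| | (a) | (b) | (c) | (d) | (e) |
|---|---|---|---|---|---|
| $\chi^2$ | 56.234 | 50.135 | 36.998 | 28.117 | 19.6802 |
| $\phi^2$ | .003,042 | .003,578 | .003,802 | .003,042 | .003,112 |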





A similar variation of $\chi^2$ arises if we take two arbitrary samples of attacked and not attacked, and inquire as to whether they were inoculated or not. In other words, when there is no "natural" proportion of the sizes of the two samples which are being compared, $\chi^2$ will vary widely (and accordingly the value of $P$ by which independence is tested) owing to the size of arbitrarily chosen samples. This variation of $\chi^2$ and of $P$, when there is no natural proportion in the two samples (as sex for example), is often overlooked in interpreting results where $\chi^2$ is really largely determined by the size of the samples compared.
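The dependence of $\chi^2$ on the arbitrary sample sizes is easily reproduced. The sketch below computes $\chi^2$ for a fourfold table from the usual formula $N(ad-bc)^2/\{(a+b)(c+d)(a+c)(b+d)\}$ and $\phi^2$ as $\chi^2/N$, and applies it to the cell frequencies of tables (a) to (d); the function name and the dictionary of tables are illustrative assumptions.

```python
def chi_square_fourfold(a, b, c, d):
    """chi^2 and phi^2 = chi^2 / N for the fourfold table [[a, b], [c, d]]."""
    total = a + b + c + d
    chi2 = (total * (a * d - b * c) ** 2
            / ((a + b) * (c + d) * (a + c) * (b + d)))
    return chi2, chi2 / total


# Cell frequencies (attacked, not attacked) for the inoculated and
# non-inoculated rows of tables (a) to (d).
TABLES = {
    "a": (56, 6759, 272, 11396),
    "b": (56, 6759, 159, 6656),
    "c": (56, 6759, 68, 2849),
    "d": (28, 3379, 136, 5698),
}

RESULTS = {name: chi_square_fourfold(*cells) for name, cells in TABLES.items()}
```

The $\chi^2$ values so found fall from about 56.2 for table (a) to about 28.1 for table (d), although the proportions attacked in the two rows are nearly unchanged, and $\phi^2$ for (a) and (d) is about .003,04 in both cases.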

Another important point, to which we may draw attention here, is the relation, often postulated as completely definite, that $\chi^2 = N\phi^2$, where $N$ is the size of the sample. If we couple this relation with the distribution of $\tfrac12\chi^2$ as given by the equation

$$y = y_0\,e^{-\frac12\chi^2}\left(\tfrac12\chi^2\right)^{\frac12(v-3)},$$

where $v$ is the number of cells, then since the Mean of $\chi^2 = v-1$ and its variance is $2(v-1)$, we should expect

$$\text{Mean } \phi^2 = \frac{v-1}{N}, \qquad \sigma^2_{\phi^2} = \frac{2(v-1)}{N^2} \qquad (\beta).$$

Now, if there be $\kappa$ columns and $\lambda$ rows in a contingency table, $v = \kappa\lambda$, but the mean value of $\phi^2$ and the value of $\sigma^2_{\phi^2}$, even if there be no association, are not

$$(\kappa\lambda-1)/N \quad\text{and}\quad 2(\kappa\lambda-1)/N^2 \qquad (\gamma),$$

see § 17 below of this paper. In the very special case of no association and $N$ large, they only approximate to

$$(\kappa-1)(\lambda-1)/N \quad\text{and}\quad 2(\kappa-1)(\lambda-1)/N^2,$$

and even then only agree with the values in ($\gamma$), when $\kappa$ and $\lambda$ are indefinitely large, but of a definitely lower order than $N$. A contingency table is hardly likely to be of practical value under such conditions. The fact is that when we are studying the $\phi^2$ of a contingency table taken as a sample from an indefinitely large population, successive samples will not have the same marginal totals and the distribution of $\phi^2$ is not that of $\chi^2/N$. When we take two series each of definite size, and test their independence by a ($\chi^2$, $P$) test, we are really dealing with what one of the present writers long ago termed "partial contingency," but it behoves the user to state very precisely what is the origin of the totals of his compared series, and to remember that his $P$ as measuring a degree of independence only applies to repeated comparison of series both of the same totals as the first, and that he cannot generalise as to the degree of dependence which would arise had he used other constant sizes. For the sake of statistical students the senior author of this paper believes it advisable to keep very distinct the usages of $\chi^2$ and $\phi^2$, and not obscure a difficult topic by assuming $\phi^2$ is merely $\chi^2/N$.
and accordingly we have for the distribution of the difference $m'_{\bar{x}_n} - m_{\bar{x}_n}$ the curve

$$z = \tfrac12 M\,T_{Nn(p+1)-\frac12}\!\left(\frac{Nnp}{a}\left(m'_{\bar{x}_n} - m_{\bar{x}_n}\right)\right) \qquad \text{(xxxvi)},$$

where $M$ is the total number of cases and $p$ and $a$ refer to the original parent population. We are thus able to test the difference between the means of the means of sets of samples.


Similarly if $m_{\mu_2}$, $m'_{\mu_2}$ denote the means of the variances of two sets of samples of $n$, each $N$ in number, taken from a normal parent distribution, then the distribution of the difference of the variances is given by

$$z = \tfrac12 M\,T_{\frac12 N(n-1)-\frac12}\!\left(\frac{nN}{2}\,\frac{m'_{\mu_2} - m_{\mu_2}}{M_2}\right) \qquad \text{(xxxvii)}.$$

Results (xxxvi) and (xxxvii) may occasionally be useful.


(10) Lastly suppose we have a contingency table, the number of cells being $\kappa\times\lambda$, and a sample of size $N$ be taken from it, then we shall find that the mean square contingency, $\phi^2$, of such a sample, under certain conditions, obeys the law of distribution

$$y = y_0\left(\tfrac12\epsilon\phi^2\right)^{p_\epsilon} e^{-\frac12\epsilon\phi^2},$$

but that $\epsilon$ is not equal to $N$, nor $p_\epsilon$ to $\tfrac12(v-3)$, where $v = \kappa\lambda$ the number of cells. If two samples having mean square contingencies $\phi^2$ and $\phi'^2$ with the same number of cells $\kappa\times\lambda$ and of the same size $N$ be drawn under the above-mentioned condition, then the frequency of their difference will be given by

$$z = \tfrac12 M\,T_{p_\epsilon+\frac12}\left\{\tfrac12\epsilon\,(\phi'^2 - \phi^2)\right\} \qquad \text{(xxxviii)}.$$

The conditions referred to will be discussed in a special section later.

But in many cases $\epsilon$ will not equal $\epsilon'$, and it is perhaps in practice a more usual problem to determine whether $\phi'^2$ may be reasonably supposed to be a sample from the same population as $\phi^2$, than to deal with $\chi'^2$ and $\chi^2$. The latter contain the total sizes of the two samples, but the $\phi^2$ and $\phi'^2$ denoting mean square contingency lead us at once to the problem of whether the coefficients of mean square contingency $\sqrt{\phi'^2/(1+\phi'^2)}$ and $\sqrt{\phi^2/(1+\phi^2)}$ and so the degrees of association in the two samples may be considered as reasonably accordant on the hypothesis of the samples being selected from the same population.

The distribution of $\phi'^2 - \phi^2$ will be considered later. It will be found that, under certain conditions, it obeys the same law of frequency as the distribution of the first product moment coefficient $p_{11}$.


(11) There is another method of approaching these problems, and only illustrations from known curves or surfaces can tell us which method is generally the more effective or more suitable in a particular type of cases. The whole of our results depend upon quantities $\bar{x}_n$, $\bar{x}_n'$; $\mu_2$, $\mu_2'$; $\chi^2$, $\chi'^2$; $\phi^2$, $\phi'^2$, which satisfy a surface which can be thrown into the form

$$w = w_0\,e^{-(V+U)}\,(VU)^p\,[dV\,dU].$$
So far we have discussed the difference distribution $V - U$ and shown that it is given by a $T_m$ function. We may now discuss the distribution of the ratio $z = V/U$*. Here $V$ and $U$ are independent and can take all values from $0$ to $\infty$. If we integrate out for them and there be $M$ sets of $V$ and $U$, we find $M = w_0\,\Gamma^2(p+1)$ which determines $w_0$. Now if we consider $U$ and $V$ as measured along two rectangular axes, $z = $ constant gives a line through the origin at slope $\tan^{-1}z$, and if we transfer to $z$ and $U$ as variates, we must integrate keeping $z$ constant from $U = 0$ to $\infty$ and then for $\tan^{-1}z$ from $0$ to $\tfrac12\pi$, or $z$ from $0$ to $\infty$.

Thus we find

$$w = w_0\,e^{-(1+z)U}\,U^{2p+1}z^p\,[dU\,dz].$$

Put $(1+z)\,U = \xi$ and keeping $z = $ const., we have

$$w = w_0\,e^{-\xi}\,\xi^{2p+1}\,\frac{z^p}{(1+z)^{2p+2}}\,[d\xi\,dz],$$

where $\xi$ goes from $0$ to $\infty$.

Now integrate out for $\xi$ and we find

$$w = w_0\,\Gamma(2p+2)\,\frac{z^p}{(1+z)^{2p+2}}\,[dz] = M\,\frac{\Gamma(2p+2)}{\Gamma^2(p+1)}\,\frac{z^p}{(1+z)^{2p+2}} \qquad \text{(xxxix)}.$$
This is the frequency curve for the distribution of the ratio $z = V/U$ for a population of $M$ ratios. To find its probability integral, we have the measure $P_{z_0}$ that the ratio should be greater than $z_0$,

$$P_{z_0} = \frac{\Gamma(2p+2)}{\Gamma^2(p+1)}\int_{z_0}^{\infty}\frac{z^p}{(1+z)^{2p+2}}\,dz.$$

Take $1+z = \dfrac{1}{y}$, $dz = -\dfrac{1}{y^2}\,dy$, and

$$P_{z_0} = \frac{\Gamma(2p+2)}{\Gamma^2(p+1)}\int_0^{\frac{1}{1+z_0}}(1-y)^p\,y^p\,dy = \frac{B_{\frac{1}{1+z_0}}(p+1,\,p+1)}{B(p+1,\,p+1)} = I_{\frac{1}{1+z_0}}(p+1,\,p+1) \qquad \text{(xl)},$$

or the incomplete B-function ratio for the value $\dfrac{1}{1+z_0}$. Here $z_0$ may be $\bar{x}_n'/\bar{x}_n$, or $\mu_2'/\mu_2$, or $\chi'^2/\chi^2$, according to the problem with which we are dealing. The quantity $I_x(p', q')$ is that tabled in the Tables of the Incomplete B-function and from them $P_{z_0}$ can be readily found. In our case we have $p' = q' = p+1$, or we are confined to the "diagonal" values of that Table.


* Cases of the distribution of V/U have already been considered by R. A. Fisher, V. Romanovsky, 
and E. S. Pearson with J. Neyman. For the purposes of the present paper we give an independent 
investigation, which throws the answer back on the Incomplete B-function Table. Fisher has provided 
a table which enables the probability of the ratio U/V to be determined by a transformation of Equation 
(xxxix).

But the matter would not appear to end here. What we have measured is the improbability that $V$ should exceed $z_0 U$. But we want to measure a certain degree of the probable round $z = 1$; we must cut off therefore an equal angle $\psi$ from the $OU$ axis, if $\psi$ be the angle $z_0$ makes with $OV$. Clearly $\cot\psi = z_0$ and $\tan\psi = \dfrac{1}{z_0}$;

[Diagram: the axes $OU$ and $OV$ through the origin $O$, with the two lines cutting off equal angles from them.]

therefore the probability that $V/U < \dfrac{1}{z_0}$ or $U/V > z_0$ is given by
$$P_{1/z_0} = \frac{\Gamma(2p+2)}{\Gamma^2(p+1)} \int_0^{\frac{1}{z_0}} \frac{z^{p}}{(1+z)^{2p+2}}\, dz.$$
Taking now $z' = \dfrac{z}{1+z}$ we have
$$P_{1/z_0} = \frac{\Gamma(2p+2)}{\Gamma^2(p+1)} \int_0^{\frac{1}{1+z_0}} z'^{\,p} (1 - z')^{p}\, dz' = P_{z_0},$$
as it should, for we have cut off equal areas.

Accordingly the total chance that $V/U$ should exceed $z_0$ or $U/V$ exceed $z_0$ is
$$Q_{z_0} = 2 I_{\frac{1}{1+z_0}}(p+1,\, p+1) \qquad \text{(xli)},$$
which may be taken as a measure of the improbability of the ratios $V/U$ and $U/V$ occurring.
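
The step from (xxxix) to (xl) and (xli) can be checked numerically. The following is only a sketch under stated assumptions: the function names are hypothetical, Simpson's rule stands in for the printed Tables of the Incomplete B-function, and the values $p = 2$, $z_0 = 3$ are arbitrary, chosen because $I_x(3,3)$ then has the closed form $10x^3 - 15x^4 + 6x^5$.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson's rule on [a, b] with n (even) sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def inc_beta_ratio(x, p, q):
    """I_x(p, q) by quadrature; assumes p >= 1 and 0 <= x < 1."""
    beta = math.gamma(p) * math.gamma(q) / math.gamma(p + q)
    return simpson(lambda t: t ** (p - 1) * (1 - t) ** (q - 1), 0.0, x) / beta

def ratio_density(z, p):
    """Frequency curve (xxxix) for z = V/U, taking M = 1."""
    c = math.gamma(2 * p + 2) / math.gamma(p + 1) ** 2
    return c * z ** p / (1 + z) ** (2 * p + 2)

def p_exceed(z0, p):
    """P_{z0} of (xl): chance that V/U exceeds z0."""
    return inc_beta_ratio(1 / (1 + z0), p + 1, p + 1)

def q_ratio(z0, p):
    """Q_{z0} of (xli): both tails taken together."""
    return 2 * p_exceed(z0, p)

p, z0 = 2, 3.0
upper_tail = p_exceed(z0, p)                                      # V/U > z0
lower_tail = simpson(lambda z: ratio_density(z, p), 0.0, 1 / z0)  # V/U < 1/z0
print(upper_tail, lower_tail, q_ratio(z0, p))
```

For these values the closed form gives $I_{.25}(3,3) = 0.103515625$, and the two tails agree with it to within quadrature error, which is the "equal areas" statement of the text.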

The Tables of the Incomplete B-function provide $Q_{z_0}$ up to $p = 50$, and a small portion of them are reproduced here for comparison. Clearly we need the argument only up to 0.5. (See Table II.)

(i) Two means $\bar{x}_1$ and $\bar{x}_1'$ of two samples of size $n$ from a Type III parent population.

$$Q_{\bar{x}_1'/\bar{x}_1} = 2 I_{\frac{1}{1+\bar{x}_1'/\bar{x}_1}}\{n(p+1),\, n(p+1)\} \qquad \text{(xlii)}.$$

Thus unless the p of the parental population is known, it has to be approximated
to from the samples. Our test would determine whether it was very improbable 
that the two samples were drawn from the same (a, p) curve. 


(ii) Two variances $\mu_2$ and $\mu_2'$ of two samples of size $n$ from a supposed normal population.

$$Q_{\mu_2'/\mu_2} = 2 I_{\frac{1}{1+\mu_2'/\mu_2}}\{\tfrac12(n-1),\, \tfrac12(n-1)\} \qquad \text{(xliii)}.$$

Our test would determine whether it is likely that both samples were taken from normal populations of the same variance, but not whether those normal populations had the same mean.

(iii) Two $\chi^2$'s with the same number of cells $\nu$ in two samples.

$$Q_{\chi'^2/\chi^2} = 2 I_{\frac{1}{1+\chi'^2/\chi^2}}\{\tfrac12(\nu-1),\, \tfrac12(\nu-1)\} \qquad \text{(xliv)}.$$


(iv) In the case of the mean square contingency we have, under conditions to 
be discussed in § 16, 


Qe greg? = 21 Te ee a) eer (xlv). 
1+ ¢/e¢? 

If e’ =e, we have Q 42/2, but it would be clearly better to be able to provide Q¢2/¢2 

when e’ is not equal to ¢ (see p. 304 above), and this will be considered later. 


(12) The previous discussion has indicated the necessity of a probability integral table for the $T_m(x)$ curve. We may write
$$S_m(x) = \int_0^x T_m(x)\, dx \qquad \text{(xlvi)}.$$

A table of $S_m(x)$ has been computed by Miss David (see below). The tabled value being $S_m(x)$, it follows that $\tfrac12(1+\alpha)$, the usual probability integral, $= .5 + S_m(x)$, and $\tfrac12(1-\alpha) = .5 - S_m(x)$. Hence the probability that an observation will lie outside the limits $\pm x$ is given by $1 - 2S_m(x)$. Considering the case of $\chi'^2$ and $\chi^2$, the difference $\chi'^2 - \chi^2$ and the ratio $\chi'^2/\chi^2$ tests will give equal probability when we have
$$1 - 2S_{\frac12(\nu-2)}\{\tfrac12(\lambda-1)\chi^2\} = 2 I_{\frac{1}{1+\lambda}}\{\tfrac12(\nu-1),\, \tfrac12(\nu-1)\} \qquad \text{(xlvii)},$$
where $\nu$ is the number of cells under consideration. If we now give $\nu$ values from 2 upward, and to $\chi'^2/\chi^2 = \lambda$ values from 1 to 100, we are able by means of Table II to find the right-hand side of the equation. Hence by Table I to determine
$$S_{\frac12(\nu-2)}(\tfrac12\chi'^2 - \tfrac12\chi^2) = S_{\frac12(\nu-2)}\{\tfrac12(\lambda-1)\chi^2\},$$
and thus $\chi^2$ and $\chi'^2 = \lambda\chi^2$. The curves thus obtained are plotted in Fig. 3. Both the arithmetical work of computing their co-ordinates and the draughtsman's work in producing the diagram were very laborious; the curves asymptote to the axes of $\chi^2$ and $\chi'^2$, being of course symmetrical. They are not rectangular hyperbolas, although they might well be described as "hyperboloidea."
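
How a single point of such a curve may be located can be sketched in code. The sketch below is an illustration, not the computers' procedure: it replaces Tables I and II by direct evaluation, it is restricted to odd $\nu$ (so that $m = \tfrac12(\nu-2)$ is a half-integer and $T_m$ has an elementary closed form, assumed here from the standard expression for $K_{n+\frac12}$), and every function name is hypothetical.

```python
import math

def t_half(n, x):
    """T_m(x) for m = n + 1/2, from the elementary form of K_{n+1/2}(x)."""
    s = sum(math.factorial(n + k) / (math.factorial(k) * math.factorial(n - k))
            * x ** (n - k) / 2 ** k for k in range(n + 1))
    return math.exp(-x) * s / (2 ** (n + 1) * math.factorial(n))

def s_half(n, x):
    """S_m(x) for m = n + 1/2, built up from S_{1/2}(x) = (1 - e^{-x})/2."""
    s = 0.5 * (1.0 - math.exp(-x))
    for j in range(n):
        s -= x * t_half(j, x) / (2 * j + 2)   # 2m + 1 with m = j + 1/2
    return s

def i_diag(x, p):
    """I_x(p, p) for integer p, written as a binomial tail sum."""
    n = 2 * p - 1
    return sum(math.comb(n, j) * x ** j * (1 - x) ** (n - j)
               for j in range(p, n + 1))

def equal_probability_point(nu, lam):
    """Point (chi2, chi2') on the nu-curve for ratio lam > 1; nu must be odd."""
    n = (nu - 3) // 2              # m = (nu - 2)/2 = n + 1/2
    p = (nu - 1) // 2
    target = 2 * i_diag(1 / (1 + lam), p)     # right-hand side of (xlvii)
    lo, hi = 0.0, 500.0
    for _ in range(200):                      # bisection on the half-difference
        mid = 0.5 * (lo + hi)
        if 1 - 2 * s_half(n, mid) > target:
            lo = mid
        else:
            hi = mid
    d = 0.5 * (lo + hi)
    chi2 = 2 * d / (lam - 1)
    return chi2, lam * chi2

chi2, chi2_dash = equal_probability_point(13, 3.0)
print(chi2, chi2_dash)
```

Repeating the call over a range of $\lambda$ for fixed $\nu$ traces one branch of the corresponding curve; the other branch follows by interchanging the two co-ordinates.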














[Diagram giving curves of equal probability for the difference and ratio tests, one curve for each value of $\nu$. Axes: scale of $\chi^2$ and scale of $\chi'^2$.]

Fig. 3.


When the statistical coefficients by which we enter this diagram are not $\chi'^2$ and $\chi^2$, we must replace them by the following as coordinates:

In case of:

Variances from a normal curve $\mu_2$ and $\mu_2'$:
$n\mu_2/\Sigma^2$, $n\mu_2'/\Sigma^2$, where $n$ is the common size of the two samples, and $\Sigma^2$ the variance of the parent population or its substitute.

Means from a Type III curve $\bar{x}_1$ and $\bar{x}_1'$:
[coordinate expressions illegible in this copy], where $n$ is the common size of the two samples; $\Sigma$ the standard deviation and $\beta_1$ belong to the parent population or its substitute. The $\nu$ in both these cases is $= n$.

Two mean Square Contingencies $\phi_1^2$ and $\phi_1'^2$:
[coordinate expressions illegible in this copy], where $\rho$ is the possible range of $\phi^2$, $= \lambda - 1$ if $\kappa > \lambda$, $\nu = 2p_1 + 3$, while $p_1$ and $p_2$ are given by Equation (lviii).

Two Correlation Ratios $\eta^2$ and $\eta'^2$ from a surface of zero correlation:
$(N-n-2)\,\eta^2$, $(N-n-2)\,\eta'^2$, where $N$ is the size of the sample, $n$ the number of arrays, and $\nu = n$.

Two Correlation Ratios $\eta^2$ and $\eta'^2$ from a surface of finite correlation:
$2p_1\eta^2$, $2p_2\eta'^2$, where $p_1$ and $p_2$ are given by Equation (lxxiv), and $\nu = 2p_1 + 3$.

Two multiple correlation coefficients $R^2$ and $R'^2$:
Replace $\eta^2$ and $\eta'^2$ by $R^2$ and $R'^2$ for the above two cases.

Given a value of $\nu$, say $\nu = 10$, then for values of $\chi'^2$ and $\chi^2$ lying between the curve marked $\nu = 10$ and its asymptotes the ratio gives a lesser probability than the difference. In other words, the difference test is a more stringent test than the ratio for all points $(\chi^2, \chi'^2)$ lying inside a given $\nu$-curve, that is to say it gives a lesser probability of the given hypothesis being correct. Since the area inside the curve is always far larger than the area outside the curve, i.e. between the curve and its asymptotes, it would thus appear that the difference test will as a general rule be likely to be the more stringent. But by simply noting the position of the $(\chi^2, \chi'^2)$ point on the diagram (Fig. 3) it will be found possible to determine which is the more stringent test of a given hypothesis in any particular case.


It may occur to the reader that if the $P'$ or the $P$ corresponding to $\chi'^2$ or $\chi^2$, or indeed both, be so small as to render it improbable that either of the compared series have a common origin, it is illogical to test whether $\chi'^2$ and $\chi^2$ have any relation. But a little consideration will show this is not so. For example, let $C_1$ and $C_2$ be two processes of inoculation, and let the two processes be applied and the numbers attacked under the two processes be recorded, in each case against a non-inoculated control. Suppose we find in each case from its $\chi^2$ a very slender possibility of the inoculated and control series being samples of the same parent population; we conclude that inoculation in this matter is of service. But granted this we are at liberty to inquire further whether the two processes of inoculation produce results so divergent that it is unlikely that they themselves could arise from the same population of inoculations. We are really testing whether one or other process is the more effective. Generally our problem will turn on the probability of a difference $\tfrac12\chi'^2 - \tfrac12\chi^2$ or a ratio $\chi^2/\chi'^2$ greater than the observed occurring. Since no value of $\chi'^2$ is "impossible," this probability is a perfectly definite one whatever the actual probabilities of $\chi'^2$ and $\chi^2$ themselves may be. It may be very improbable that the $\chi'^2$ sample belongs to a parent population A, or that the $\chi^2$ sample belongs to a parent population B; neither can be impossible, and accordingly there is no logical reason to hinder us from testing the probability of the combined difference or ratio occurring. All we must be careful about is the interpretation we give to our result.


(13) Construction of Table I. 
This table was computed in the following manner by Miss F. N. David. 


It is known that*
$$\frac{dK_m(x)}{dx} = \frac{m}{x} K_m(x) - K_{m+1}(x) \qquad \text{(xlviii)},$$
while
$$T_m(x) = \frac{1}{\sqrt{\pi}\, 2^{m}\, \Gamma(m+\tfrac12)}\, x^{m} K_m(x) \qquad \text{(xlix)}.$$
Substituting (xlix) in (xlviii) we have, after a slight reduction,
$$T_{m+1}(x) = \frac{2m}{2m+1}\, T_m(x) - \frac{x}{2m+1}\, \frac{dT_m(x)}{dx} \qquad \text{(l)},$$
an equation providing the differential coefficient of $T_m(x)$.
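
Equation (l) is easily tested where $T_m$ is elementary. The sketch below is an illustration under an assumption not stated in the text: for $m = n + \tfrac12$ it takes $T_m(x)$ from the standard closed form of $K_{n+\frac12}(x)$, so that for instance $T_{\frac12}(x) = \tfrac12 e^{-x}$ and $T_{\frac32}(x) = \tfrac14 e^{-x}(1+x)$. The derivative is approximated by a central difference, and the function names are hypothetical.

```python
import math

def t_half(n, x):
    """T_m(x) for m = n + 1/2, from the elementary form of K_{n+1/2}(x)."""
    s = sum(math.factorial(n + k) / (math.factorial(k) * math.factorial(n - k))
            * x ** (n - k) / 2 ** k for k in range(n + 1))
    return math.exp(-x) * s / (2 ** (n + 1) * math.factorial(n))

def both_sides_of_l(n, x, h=1e-5):
    """Return (T_{m+1}(x), right-hand side of (l)) for m = n + 1/2."""
    m = n + 0.5
    derivative = (t_half(n, x + h) - t_half(n, x - h)) / (2 * h)
    rhs = 2 * m / (2 * m + 1) * t_half(n, x) - x / (2 * m + 1) * derivative
    return t_half(n + 1, x), rhs

for n in range(5):
    for x in (0.5, 2.0, 7.0):
        lhs, rhs = both_sides_of_l(n, x)
        print(n + 0.5, x, lhs, rhs)
```

The two printed columns agree to the accuracy of the finite difference.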


* This follows at once from the equations in Biometrika, Vol. xx1. pp. 181 (footnote) and 184. 


Integrating from $x = 0$ to $x$, we find
$$S_{m+1}(x) = \frac{2m}{2m+1}\, S_m(x) - \int_0^x \frac{x}{2m+1}\, \frac{dT_m(x)}{dx}\, dx,$$
and integrating the last integral by parts, we conclude that
$$S_{m+1}(x) = S_m(x) - \frac{x}{2m+1}\, T_m(x) \qquad \text{(li)}.$$


Now since Dr E. M. Elderton's Table gives $T_m(x)$, we can from a knowledge of $S_m(x)$ find $S_{m+1}(x)$, and thus by repeated use of (li) build up the table of $S_m(x)$. Since we require $m$ to advance by 0.5 intervals, we need to find $S_{\frac12}(x)$ and $S_0(x)$.

Now $S_{\frac12}(x) = \int_0^x T_{\frac12}(x)\, dx$, and $T_{\frac12}(x) = \tfrac12 e^{-x}$; accordingly
$$S_{\frac12}(x) = \tfrac12\,(1 - e^{-x}).$$

Thus $S_{\frac12}(x)$ could be and was calculated from Glaisher's Table of the Exponential. From this value of $S_{\frac12}(x)$ all the values of $S_m(x)$ for $m = 1.5, 2.5, \ldots 11.5$ in Table I were computed by (li) in succession.
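
The build-up by (li) from $S_{\frac12}$ can be sketched as follows. This is a minimal illustration, not the computers' procedure: in place of Elderton's Table it assumes the elementary form of $T_m(x)$ for half-integer $m$ (from the standard expression for $K_{n+\frac12}$), the function names are hypothetical, and Simpson's rule is used only as an independent check.

```python
import math

def t_half(n, x):
    """T_m(x) for m = n + 1/2, from the elementary form of K_{n+1/2}(x)."""
    s = sum(math.factorial(n + k) / (math.factorial(k) * math.factorial(n - k))
            * x ** (n - k) / 2 ** k for k in range(n + 1))
    return math.exp(-x) * s / (2 ** (n + 1) * math.factorial(n))

def s_half(n, x):
    """S_m(x) for m = n + 1/2 by repeated use of (li)."""
    s = 0.5 * (1.0 - math.exp(-x))            # S_{1/2}(x)
    for j in range(n):
        s -= x * t_half(j, x) / (2 * j + 2)   # 2m + 1 with m = j + 1/2
    return s

def s_by_quadrature(n, x, steps=2000):
    """Independent check: Simpson's rule applied to T_m on [0, x]."""
    h = x / steps
    total = t_half(n, 0.0) + t_half(n, x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_half(n, i * h)
    return total * h / 3

print(s_half(5, 18.0))   # S_{5.5}(18)
print(s_half(6, 18.0))   # S_{6.5}(18)
print(s_half(3, 2.0), s_by_quadrature(3, 2.0))
```

The first two values may be compared with the Table I entries quoted in Illustrations (iii) and (iv) below, $S_{5.5}(18) = .499{,}989$ and $S_{6.5}(18) = .499{,}97663$.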


The value of $S_0(x) = \int_0^x T_0(x)\, dx$ is not so easy to determine, because $T_0(x)$ is infinite when $x = 0$, and no quadrature formula is applicable. It was therefore resolved that $S_1(x) = \int_0^x T_1(x)\, dx$ should first be found by quadrature, and then
$$S_0(x) = S_1(x) + x\,T_0(x)$$
be found from the result, since $\operatorname{Lt}_{x=0}\{x\,T_0(x)\} = 0$, which surmounts the difficulty, and thus $S_0(x)$ was determined. The values of $S_1(x)$ obtained by quadratures from $T_1(x)$ were computed by Mr E. C. Fieller, and appear in the column under $m = 1.0$ of Table I. The ordinates were taken at intervals of 0.02 from $x = 0$ to 0.6, at intervals of 0.1 from $x = 0.6$ to 4.0, and after $x = 4.0$ up to $x = 18.5$ by intervals of 0.5. The work was laborious, the ordinates being calculated to eight figure accuracy, but the areas, given to eight figures, were scarcely to be trusted to the last digit, where there might be an error of 1 to 2. Thus the seventh decimal might sometimes, but rarely, be in error in a unit. For this reason Miss David's Table computed to eight figures was cut down to six for publication. Although the values of $S_m(x)$ for integer values + 0.5 of $m$ could be obtained with any desired degree of accuracy, those for integer values only depended on a quadrature, which it was difficult to make reliable to eight decimal places. As a matter of fact Miss David's eight figure table was used for all the illustrations which follow, but as linear interpolation was employed* as adequate for the purpose we had in view, we should have got nearly the same final results from the six figure table now published.


Those who have occasion to use the table must be careful to note that from 
$x = 0$ to 4.0, the table advances by 0.1, but from $x = 4.0$ to $x = 18.0$ by 0.5, and this
change must be borne in mind when interpolating into the table.
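
The effect of the change of interval on linear interpolation can be sketched with the one column that has an elementary form, $S_{\frac12}(x) = \tfrac12(1 - e^{-x})$. The code below is an illustration only; the grid is rebuilt from the description just given, and the function names are hypothetical.

```python
import bisect
import math

def table_grid():
    """Arguments of the table: 0 to 4.0 by 0.1, then 4.0 to 18.0 by 0.5."""
    xs = [round(0.1 * i, 1) for i in range(41)]      # 0.0, 0.1, ..., 4.0
    xs += [4.0 + 0.5 * i for i in range(1, 29)]      # 4.5, 5.0, ..., 18.0
    return xs

def s_one_half(x):
    """S_{1/2}(x) = (1 - e^{-x})/2."""
    return 0.5 * (1.0 - math.exp(-x))

def interpolate(xs, ys, x):
    """Linear interpolation; the bracketing interval is looked up, not assumed."""
    i = bisect.bisect_right(xs, x) - 1
    i = min(max(i, 0), len(xs) - 2)
    w = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + w * (ys[i + 1] - ys[i])

xs = table_grid()
ys = [s_one_half(x) for x in xs]
for x in (0.25, 5.2):
    print(x, interpolate(xs, ys, x), s_one_half(x))
```

Because the curve is concave, the interpolated value falls slightly below the true one, and the shortfall is larger where the tabular interval is 0.5.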


* In a few cases where the value of $x$ led us to the top of the table higher differences were introduced.

If we are dealing with $\nu$ categories, $m = \tfrac12(\nu - 2)$, and $\nu$ is the number of cells
indicated by $n$ at the head of the column.


(14) Construction of Table II. 


The probability of a ratio, e.g. $\chi'^2/\chi^2 = \lambda$, is given by Equation (xlvii), and demands a table of $I_{\frac{1}{1+\lambda}}\{\tfrac12(\nu-1),\, \tfrac12(\nu-1)\}$, where $I_x(p, q)$ is the incomplete B-function ratio, or
$$I_x(p, q) = \frac{\int_0^x x^{p-1}(1-x)^{q-1}\, dx}{\int_0^1 x^{p-1}(1-x)^{q-1}\, dx} \qquad \text{(lii)}.$$

An important relation between two kinds of B-function ratios may be noted here,
$$I_x(p, p) = \tfrac12\{1 + I_{x'}(\tfrac12, p)\} \qquad \text{(liii)},$$
where $x' = 4(x - \tfrac12)^2$.

In the actual table* we should not find $I_{x'}(\tfrac12, p)$ but only $I_{x''}(p, \tfrac12)$. The relation between them is
$$I_{x'}(\tfrac12, p) = 1 - I_{1-x'}(p, \tfrac12);$$
thus we modify (liii) and put
$$I_x(p, p) = 1 - \tfrac12\, I_{1-x'}(p, \tfrac12)\dagger \qquad \text{(liii bis)},$$
where $x' = 4(x - \tfrac12)^2$.

In actual practice this relationship may be of considerable value as transforming a value of the incomplete B-function from a part of the table where interpolation is difficult to another part where it is easier.
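
Relation (liii bis) can be verified on the case $I_{.7}(6, 6)$ worked in the footnote. The sketch below is an illustration with hypothetical function names: the diagonal value is obtained exactly from a binomial sum (valid for integer $p$), and the value $I_{.84}(6, \tfrac12)$, which the authors take from the table, is here obtained by Simpson's rule.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson's rule on [a, b] with n (even) sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def inc_beta_ratio(x, p, q):
    """I_x(p, q) by quadrature; assumes p >= 1 and 0 <= x < 1."""
    beta = math.gamma(p) * math.gamma(q) / math.gamma(p + q)
    return simpson(lambda t: t ** (p - 1) * (1 - t) ** (q - 1), 0.0, x) / beta

def i_diag(x, p):
    """I_x(p, p) for integer p, written as a binomial tail sum."""
    n = 2 * p - 1
    return sum(math.comb(n, j) * x ** j * (1 - x) ** (n - j)
               for j in range(p, n + 1))

x = 0.7
x_dash = 4 * (x - 0.5) ** 2                     # 0.16
direct = i_diag(x, 6)                           # left-hand side of (liii bis)
via_half = 1 - 0.5 * inc_beta_ratio(1 - x_dash, 6, 0.5)
print(direct, via_half)
```

Both routes give the value .921,775,209 quoted in the footnote, to within quadrature error.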


Those who wish to find $I_x(p, p)$ can either use the present paper's Table II or use the values of $P_x(n)$ which have already been published in the Tables of the Probability Integral for Symmetrical Curves issued in Biometrika, Vol. XXII. pp. 274–283, or in Tables for Statisticians (Part II), pp. 169–178. In this case $P_{x'}(n) = \tfrac12\{1 + I_{x'}(\tfrac12, p)\}$ is actually provided and equals $I_x(p, p)$, where
$$x = \tfrac12\,(1 + \sqrt{x'}).\ddagger$$

The present Table II renders the discovery of the value of $I_x(p, p)$ very easy. It has not been carried further than $x = .50$, for $\lambda$ only takes values from 1 to $\infty$.

* Now at press, and shortly to be issued.
† For example, consider $I_{.7}(6, 6)$; its value taken out directly is .921,775,209. Now $x = .7$, $x' = .16$, and $1 - x' = .84$, thus the table gives
$I_{.84}(6, \tfrac12) = .156{,}449{,}582$.
Hence $1 - \tfrac12 I_{.84}(6, \tfrac12) = 1 - .078{,}224{,}791 = .921{,}775{,}209$, which is the value of $I_{.7}(6, 6)$ found directly.
‡ Thus in the example of the previous footnote, we must look out under $\tfrac12(n - 1) = 6$, and $x' = .16$, and we find $.921{,}775{,}2 = I_{.7}(6, 6)$ to seven figures.

Table II was extracted from the Incomplete B-function manuscript by Miss
M. T. Beer, and, as that table does not go further than 10.5 for the half-unit
intervals, the values for $p = 11.5$ were computed by Miss Brenda Stoessiger de novo
in order that the range of Tables I and II might be the same. We have cordially 
to acknowledge their aid as well as that of Miss M. Kirby for the diagrams, in 
particular for Fig. 3. 


(15) Illustrations of the Method of using Tables I and II, and of the value of 
Fig. 3. 
Illustration (i). The following tables are taken from a paper by K. Pearson: A Study of Trypanosome Strains*.

TABLE i(a) of Memoir.
Length of Trypanosomes in Microns.

Goat as Host             | 12 and under | 13 | 14 | 15 | 16 | 17 and over | Totals
Wild G. morsitans Strain | 37 | 55 | 60 | 32 | 12 | 4 | 200
Wild Game Strain         | 17 | 37 | [illegible] | 38 | 26 | 9 | 200

TABLE i(b) of Memoir.
Length of Trypanosomes in Microns.

Dog as Host              | 12 and under | 13 | 14 | 15 | 16 | 17 and over | Totals
Wild G. morsitans Strain | 17 | 34 | 41 | 40 | 19 | 9 | 160
Wild Game Strain         | 12 | 31 | 57 | 50 | 24 | 6 | 180





If we apply the method of Biometrika, Vol. VIII. pp. 250–254, to ascertain whether the Wild G. morsitans Strain and Wild Game Strain are probably samples from the same population, we find $\chi'^2 = 17.216$ from Table i(a) and $\chi^2 = 4.745$ from Table i(b), leading to $P' = .0042$ and $P = .4499$ in the two cases respectively for six cells. From the goat as host we should probably argue that the two strains of trypanosomes were different, from the dog as host that they were the same. While the $\chi'^2$ for the goat is very improbable, we must remember that it is not impossible. Two possibilities now arise: (a) the two strains are not differentiated by their hosts, (b) the two strains are differentiated by their host in the same manner. Are the two $\chi^2$'s compatible with each other on either of these hypotheses? We have
$$\tfrac12\chi'^2 - \tfrac12\chi^2 = 6.2355 \quad\text{and}\quad \chi'^2/\chi^2 = 3.6282.$$


* Biometrika, Vol. X. 1914–1915, pp. 117–118.

What do our two tests give us for the probability of compatibility in these two $\chi^2$'s? We have for the difference test
$$P_{\frac12\chi'^2 - \frac12\chi^2} = 2\{.5 - S_2(6.2355)\} = 2\{.5 - .493{,}187\} = .0136, \text{ from Table I},$$
and for the ratio test
$$Q_{\chi'^2/\chi^2} = 2 I_{.2161}(2.5,\, 2.5) = 2 \times .0918{,}7173 = .1837, \text{ from Table II}.$$

Fig. 3, p. 308, indicates at once that with $\chi'^2 = 17.216$ and $\chi^2 = 4.745$, our point is very considerably inside the curve for $\nu = 6$, or without working out the numerical results we know that the difference test will be more stringent than the ratio test. Clearly the ratio test gives us a moderate probability of either (a) or (b) being the fact, but the difference test suggests that neither hypothesis is correct, or that goat and dog react on the trypanosome strains in different manners. This is in accordance with the $P'$ and $P$ found in the first place for the two tables, but the ratio test being less stringent obscures the first impressions drawn from $P'$ and $P$. This particular illustration was taken without any knowledge of what the tests would lead to. A similar example, with the $\chi'^2$ smaller, might have made it less easy to draw any definite conclusions from $P'$ and $P$, while $P_{\frac12\chi'^2 - \frac12\chi^2}$ and $Q_{\chi'^2/\chi^2}$ might one or both give rise to conclusive results.


Illustration (ii). To illustrate the last remark we will take two further tables from Pearson's Memoir on Trypanosomes. They are as follows:

TABLE ii(a).
Length of Trypanosomes in Microns.

Goat as Host             | 11 and under | 12 | 13 | 14 | 15 | 16 and over | Totals
Wild G. morsitans Strain | 16 | 21 | 55 | 60 | 32 | 16 | 200
Mvera Cattle Strain      | 5 | 14 | 22 | 26 | 19 | 14 | 100

TABLE ii(b).
Length of Trypanosomes in Microns.

Dog as Host              | 11 and under | 12 | 13 | 14 | 15 | 16 and over | Totals
Wild G. morsitans Strain | 3 | 14 | 34 | 41 | 40 | 28 | 160
Mvera Cattle Strain      | 3 | 11 | 27 | 30 | 21 | 8 | 100

The $\chi^2$ for Table ii(a), obtained with a view to testing whether the Wild G. morsitans and the Mvera Cattle strains could be samples of the same trypanosome population, $= 5.468$, rendering for six cells a probability $P = .3646$, or we may say the Goat as Host cannot be considered as distinguishing between the two strains. We now turn to Table ii(b) with the Dog as Host and find $\chi'^2 = 6.391$, with the probability $P' = .2728$. The probability is somewhat less, but far from sufficiently less to enable us to say that the Dog as Host will distinguish between the two strains.


We can now ask on the basis of both tests if it be indifferent whether the difference between the two strains be tested on Goat or Dog?

What is the probability in fact that, in the case of these two strains, the Dog results might have been obtained from the Goat or the Goat results from the Dog?

We have for the difference test
$$P_{\frac12\chi'^2 - \frac12\chi^2} = 2\{.5 - S_2(.4615)\}, \text{ or, from Table I, } = 2\{.5 - .096{,}251\} = .8075.$$

Again, for the ratio test, since $\chi'^2/\chi^2 = 1.16885$,
$$Q_{\chi'^2/\chi^2} = 2 I_{.46107}(2.5,\, 2.5), \text{ or, from Table II, } = .8682.$$

We see that from either test there was high probability of the goat or dog as host being indifferent, but the difference test gives slightly the more stringent result as Fig. 3 à priori indicates it must do, although the $(\chi^2, \chi'^2)$ point is not far removed from the $\nu = 6$ curve where the two tests give equivalent results.

We might draw from Illustrations (i) and (ii) the conclusion that when the 
strains are sensibly identical the host is indifferent, but when the strains appear to 
be different one host may give a more marked reaction than another. 


Illustration (iii). Table iii(a) below was obtained from the schedules of 
Pearson’s inquiry into the condition of the Polish and Russian Jew immigrants into 
the East End of London. Table iii(b) was adapted from Table VII, p. 255, of 
Franz Boas’s work Descendants of Immigrants, New York, Columbia University Press, 
1912. The problem to be answered is this: The distributions of Cephalic Indices of 
the Jewish children born in their adopted country and those born in their land of 
origin are significantly different. Can this difference be attributed to the same 
causes in England and in America? 


The tables are as follows: 


TABLE iii(a). (Pearson's Data.)
Cephalic Indices (Central Values). 


} " — 
| 


Male Jewish Boys 
aged 6 to 15 years 


Born in England 
Born in Eastern 
Europe 











| | 
iU YY vs - " - - | bs rs . - co — 
spas 77°45 | 78°45 | 79°45 | 80°45 | 81:45 | 82°45 | 83°45 | 84°45 | 85°45 | 86°45 | 87°45 pee 
| | | | 
ke |, 20 13 20 18 31 24 29 20 19 8 7 1] 
>. Boe 3 2 13 7 7 15 13 11 13 4 12 
TABLE iii(b). (Boas's Data.)
Cephalic Indices (Central Values).

Male Jewish Boys aged 6 to 15 years | Under 77 | 77.5 | 78.5 | 79.5 | 80.5 | 81.5 | 82.5 | 83.5 | 84.5 | 85.5 | 86.5 | 87.5 | Over | Totals
Born in Eastern Europe              | 8 | 6 | 10 | 23 | 40 | 47 | 87 | 92 | 93 | 105 | 84 | 82 | 116 | 793
Born in America                     | 66 | 48 | 121 | 155 | 248 | 263 | 305 | 289 | 244 | 192 | 140 | 69 | 119 | 2259

















The $\chi^2$ of Table iii(a) is 27.907 corresponding to a $P$ of .0057, and the $\chi'^2$ of Table iii(b) is 257.399 corresponding to a $P'$ < .000,0001. Thus the chance of the distribution of the Cephalic Index of Jewish boys born in England being the same as that of Jewish boys born in Eastern Europe is small; the chance that Jewish boys born in America have the same distribution of Cephalic Index as that of Jewish boys born in Eastern Europe is vanishingly small. Boas attributes the difference for America to the influence of the American environment causing the head shape of Jewish children born in America* to approach the Gentile value. Pearson supposes it may be due in the corresponding English case to some admixture of Gentile blood. Whatever the origins of the difference of $\chi^2$'s, we may ask how far is there any likelihood of the differences being due to a common cause. In other words, if we took samples of the children of immigrant Jews before and after immigration, what is the chance that two samples will have a difference in their $\chi^2$'s equalling or exceeding that observed? The number of cells is 13, and a recourse to Fig. 3 shows us that the point $(\chi^2, \chi'^2)$ lies well within and away from the $\nu = 13$ curve; the difference method will therefore be far more stringent than the ratio method.

We have $\tfrac12\chi'^2 - \tfrac12\chi^2 = 114.746$ and
$$P_{\frac12\chi'^2 - \frac12\chi^2} = 2\{.5 - S_{5.5}(114.746)\}.$$
But Table I shows us that $S_{5.5}(18) = .499{,}989$ and $S_{5.5}(114.746)$ must be much nearer .5 than this, or
$$P_{\frac12\chi'^2 - \frac12\chi^2} < .000{,}022.$$
Again, $\chi'^2/\chi^2 = 9.2234$, and
$$Q_{\chi'^2/\chi^2} = 2 I_{\frac{1}{1+9.2234}}(6, 6) = 2 I_{.0978}(6, 6) = .000{,}534.$$

Both tests indicate that it is very improbable that the cephalic index divergences 
have the same cause in America and England, but the difference test is far more 
stringent. 

* It is important to note that the mere residence in America is not supposed to modify the head shape of children coming to America. It is the fact of birth in America which is credited with the change.

Thus far we have merely applied the $(\chi^2, \chi'^2)$ test and drawn an apparent conclusion from it, but in doing so we have really overlooked the warning given in the long footnote to p. 302. While in the case of the trypanosomes in Illustrations (i) and (ii), we have been dealing with total frequencies of much the same order in both the two sets of tables, so that $N'$ and $N$ of $\chi'^2$ and $\chi^2$ will hardly affect the result; in the present illustration Boas's total number of individuals is nine times Pearson's. Hence, if his proportions remaining the same were reduced to Pearson's total, his $\chi'^2$ would be 28.600 and $\tfrac12(\chi'^2 - \chi^2) = 0.3465$, giving $S_{5.5}(0.3465)$ instead of $S_{5.5}(114.746)$. We should thus reach
$$P_{\frac12\chi'^2 - \frac12\chi^2} = .9148,$$
or with a high degree of probability conclude that the difference between Pearson's and Boas's Jews and Gentiles could be attributed to a common source.
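
The arithmetic of this reduction can be followed in a short sketch. It is an illustration only, with hypothetical function names. Assumptions: $\chi'^2$ is recomputed from Boas's frequencies by the two-sample formula $NN' \sum (f/N - f'/N')^2/(f+f')$; the one Eastern Europe frequency that is garbled in this copy is taken as 84, the value implied by the row total of 793; the reduction is the division by nine described above; and $S_{5.5}$ is evaluated from the elementary form of $T_m$ for half-integer $m$ rather than read from Table I.

```python
import math

# Table iii(b), Boas's data (13 cells); the value 84 is implied by the total 793.
eastern_europe = [8, 6, 10, 23, 40, 47, 87, 92, 93, 105, 84, 82, 116]
america = [66, 48, 121, 155, 248, 263, 305, 289, 244, 192, 140, 69, 119]

def chi2_two_samples(f, g):
    """chi^2 for two frequency series regarded as samples of one population."""
    n, m = sum(f), sum(g)
    return n * m * sum((a / n - b / m) ** 2 / (a + b) for a, b in zip(f, g))

def t_half(n, x):
    """T_m(x) for m = n + 1/2, from the elementary form of K_{n+1/2}(x)."""
    s = sum(math.factorial(n + k) / (math.factorial(k) * math.factorial(n - k))
            * x ** (n - k) / 2 ** k for k in range(n + 1))
    return math.exp(-x) * s / (2 ** (n + 1) * math.factorial(n))

def s_half(n, x):
    """S_m(x) for m = n + 1/2 by repeated use of (li)."""
    s = 0.5 * (1.0 - math.exp(-x))
    for j in range(n):
        s -= x * t_half(j, x) / (2 * j + 2)
    return s

chi2_pearson = 27.907                      # Table iii(a), as printed
chi2_boas = chi2_two_samples(eastern_europe, america)
chi2_boas_reduced = chi2_boas / 9          # Boas's total taken as nine times Pearson's
half_difference = 0.5 * (chi2_boas_reduced - chi2_pearson)
p_difference = 1 - 2 * s_half(5, half_difference)   # 13 cells, m = 5.5
print(chi2_boas, chi2_boas_reduced, half_difference, p_difference)
```

The recomputed $\chi'^2$ agrees with the printed 257.399, and the probability comes out close to the printed .9148.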

We do not say that the process here adopted is wholly legitimate, but it does indicate the need for caution in applying the $(\chi^2, \chi'^2)$ test in either form*, and suggests that a $(\phi^2, \phi'^2)$ test may be better applicable when the marginal totals are so different and so arbitrary†.


Illustration (iv). The matter of the preceding illustration may be pursued in 
a somewhat different direction. The cephalic indices of Jew and Gentile are 
markedly divergent. If we take as our Gentiles two such closely allied races as 
English and Swedish, with an almost identical mean index, will either of our 
tests suffice to indicate a marked difference between the y*’s for the two series ? 

Two such series are shown in Tables iv(a) and (b). Table iv(a) is taken from a paper by Nathaniel D. M. Hirsch, entitled: "Cephalic Index of American born Children of three Foreign Groups‡."

Table iv(b) is based on Pearson's data for the Jewish Children of East London, and on his data for English School Children.

TABLE iv(a). (Hirsch's Data.)
Cephalic Index (Central Values). 


| | 














Males born Under! ., - oe see ot m a ¥ p 
in America | 74 pos | 8 | ee | eS | ee | ee | See 
i | | | 
; Sy ET 
Russian Jews 5 2 4 11 21 | 34 | 46 | 65 | 
Swedes as 17 19 17 19 18 SM i wis 
Ae ee ee ee ae 
PLS einige te ae = 7 
Males born 187 and 
: +5 | 825 | 88:5 | 84 55 | 865 | ! 
| fie Beelae 81°5 82°5 83-5 84°5 | 85°5 86°5 ee Totals 
Russian Jews 53 52 49 | 34 27 17 14 434 | 
Swedes — 17 10 9 5 4 4-1 § 197 | 





Here $\chi^2 = 134.1757$, giving a $P$ for 15 cells < .000,0005, and there is no practical probability of the two samples coming from the same population.

* $Q_{\chi'^2/\chi^2} = 2 I_{.4939}(6, 6) = .9670$, the difference test being the more stringent.

† The ratio of numbers born in the adopted country to those born in the native land is hardly a "natural" one; for England it is 2.21 and for America 2.85.

‡ American Journal of Physical Anthropology, Vol. X. 1927, pp. 79–90, Table I, p. 80.





TABLE iv(b). (Pearson's Data.)
Cephalic Index (Central Values).

Males born in England | Under 73.95 | 74.45 | 75.45 | 76.45 | 77.45 | 78.45 | 79.45 | 80.45
Eastern Jews          | - | - | 5 | 7 | 10 | 13 | 20 | 28
English               | 146 | 86 | 167 | 226 | 272 | 342 | 266 | 230

Males born in England | 81.45 | 82.45 | 83.45 | 84.45 | 85.45 | 86.45 | 86.95 and over | Totals
Eastern Jews          | 31 | 24 | 29 | 20 | 19 | 8 | 18 | 232
English               | 188 | 145 | 99 | 61 | 39 | 25 | 21 | 2313





Here $\chi'^2 = 234.6659$ for the 15 cells, and again $P'$ is < .000,0005. Thus both series of data are in accord in indicating that the Jewish male child differs essentially in head shape from either of these series of Gentile male children. But a new problem arises: Is it indifferent whether the Gentiles considered are English or Swedish? What is the probability that $\chi'^2$ and $\chi^2$ could result from two samples drawn from the same Jew-Gentile population?


We will apply first the ratio test. Here $\chi'^2/\chi^2 = 1.748{,}954$, and we have, since $\nu = 15$, and $\tfrac12(\nu - 1) = 7$,
$$Q_{\chi'^2/\chi^2} = 2 I_{.3638}(7, 7) = .3077.$$

Thus on the basis of the ratio test, it is not at all improbable that $\chi'^2$ and $\chi^2$ could have arisen from the same population, i.e. it is indifferent whether Swedish or English boys be compared with the Jews.

Now let us consider the difference test. We have $\tfrac12\chi'^2 - \tfrac12\chi^2 = 50.2451$, and $\tfrac12(\nu - 2) = 6.5$. Hence
$$P_{\frac12\chi'^2 - \frac12\chi^2} = 2\{.5 - S_{6.5}(50.2451)\}.$$
Now 50.2451 is outside the limits of our Table I and $S_{6.5}(18) = .499{,}97663$, so $S_{6.5}(50.2451)$ has a greater value than this, and we can only say that
$$P_{\frac12\chi'^2 - \frac12\chi^2} < .000{,}047.$$


In other words, we should conclude that the difference test strongly points to a 
divergence between the use of English and Swedes as the Gentile factor, and this 
would certainly be in accordance with the views of anthropologists, and in 
particular craniologists. We thus see that the greater stringency of the difference 
test has led us to a result more in accordance with fact than the ratio test, and this might apparently justify us in granting it a position at least alongside the ratio test as a statistical method. But here again: Is any conclusion legitimate based upon the series to be compared being as arbitrary in size as the four series of these two tables? These sizes are perfectly arbitrary in the two cases; they were determined by the data each observer chanced to collect, and not by any "natural" proportions. All our analysis tells us is: that if a long series of further experiments were made, always with the same totals for the four series, we should have frequencies determined by the above $P_{\frac12\chi'^2 - \frac12\chi^2}$ and $Q_{\chi'^2/\chi^2}$. But what if we sacrifice the increased accuracy obtained in the case of Hirsch's Jews and Pearson's English and reduce all four series to a common total $M$? Pearson's formula of 1911* gives


9 


(6-9 

‘2 ae 

. ONT) BE | 
N+WN’ 


Here the part within curled brackets consists only of proportional frequency and, neglecting influence of random sampling, would remain unchanged if $N$ and $N'$ were modified. Accordingly, if we multiply each $\chi^2$ by
$$\frac{(N + N')^2}{NN'} \times \frac{MM'}{(M + M')^2}$$
we shall reduce it to what would arise if we had the series $M$ and $M'$ instead of $N$ and $N'$. As there is no reason whatever why we should not take as many Jews as Gentiles, we may put $M = M'$, or the multiplier is $(N + N')^2/4NN'$. For Hirsch's data the multiplier is 1.16424, and for Pearson's 3.01753. Thus we have
$${}_M\chi^2 = 156.2127, \qquad {}_M\chi'^2 = 708.1139,$$
giving $\tfrac12\chi'^2 - \tfrac12\chi^2 = 275.9506$ and $\chi'^2/\chi^2 = 4.53303$. These lead to
$$P_{\frac12\chi'^2 - \frac12\chi^2} = 1 - 2S_{6.5}(275.9506) < .000{,}047$$
and much less; and
$$Q_{\chi'^2/\chi^2} = 2 I_{.180738}(7, 7) = .007{,}787.$$

Both probabilities now oppose the suggestion that we are merely comparing Jew and Gentile; they indicate that there is a real difference between Jews compared with Swedes and Jews compared with English. The difference test is, however, much the more stringent.
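
The reduction to equal totals is simple arithmetic and may be sketched as follows. The function name is hypothetical. The totals are taken as 434 and 197 for Hirsch's two series and 232 and 2313 for Pearson's; 232 is the value that reproduces the printed multiplier 3.01753.

```python
def reduction_multiplier(n, n_dash, m=1.0, m_dash=1.0):
    """(N + N')^2 / (N N') times M M' / (M + M')^2; equal M and M' by default."""
    return (n + n_dash) ** 2 / (n * n_dash) * (m * m_dash) / (m + m_dash) ** 2

mult_hirsch = reduction_multiplier(434, 197)
mult_pearson = reduction_multiplier(232, 2313)

chi2_reduced = mult_hirsch * 134.1757          # Table iv(a)
chi2_dash_reduced = mult_pearson * 234.6659    # Table iv(b)

half_difference = 0.5 * (chi2_dash_reduced - chi2_reduced)
ratio = chi2_dash_reduced / chi2_reduced
print(mult_hirsch, mult_pearson)
print(chi2_reduced, chi2_dash_reduced, half_difference, ratio)
```

With $M = M'$ the common total cancels, which is why the multiplier $(N+N')^2/4NN'$ does not depend on the value chosen for $M$. The results agree with the printed figures to about the third decimal place.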


Illustration (v). We may now turn to another form of application of our method, namely to judicial statistics. We take our data from Judicial Statistics, England and Wales, 1925 (Criminal Statistics), Table VII, pp. 68–69, and 1930, Table VII, pp. 56–57; published by H.M. Stationery Office in 1927 and 1932 respectively.


* Biometrika, Vol. VIII. p. 252.

Males convicted in England and Wales in Assizes and Quarter Sessions, by Age Groups.

TABLE v(a).
Crimes against the Person.

Year   | Under 16 | 16 and under 21 | 21 and under 30 | 30 and under 40 | 40 and under 60 | Over 60 | Total
1925   | 12 | 137 | 369 | 318 | 297 | 45 | 1178
1930   | 6 | 124 | 326 | 259 | 257 | 36 | 1008
Totals | 18 | 261 | 695 | 577 | 554 | 81 | 2186

TABLE v(b).
Crimes against Property with Violence.

Year   | Under 16 | 16 and under 21 | 21 and under 30 | 30 and under 40 | 40 and under 60 | Over 60 | Total
1925   | 19 | 591 | 1059 | 423 | 287 | 47 | 2426
1930   | 28 | 877 | 1389 | 506 | 327 | 61 | 3188
Totals | 47 | 1468 | 2448 | 929 | 614 | 108 | 5614





Superficially it would appear that Crimes against the Person have decreased at 
each age, and Crimes against Property with Violence have increased at each age*. 

The x? for Table v (a) = 20207 indicating a value of P for six cells of 8460; the 
samples for the two years might accordingly have arisen from the same population, 
or we cannot by this test assert a fall in the five years of Crimes against the 
Person. 

Turning to Table v(b) we have y”*=105304 with P’ =0626; this is not 
absolutely against the 1925 and 1930 results being samples of the same popula- 
tion—if they were, one sample in about 17 would give a greater discrepancy 
between the two years than the present one—but it does not like the P of the x? 
of the first table suggest no change in the intensity of crime for the two years. 

We may now turn to the psual secondary problem: Is it likely that such 
changes as are exhibited in the two tables are compatible with a common origin? 
We ask if the x” and x* could arise from sampling from a common source. We do 
not define this common source; it may be that both crimes against the person and 
against property with violence at each age are decreasing or are stationary or are 

* It is to be noted that the data pay no attention to changes in the population of each age group in 
the five years. 
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increasing. None of these possibilities is definitely ruled out by an overwhelming 
improbability, but some of them are not very probable in either one or other case. 
We have to consider whether y”* and y* are improbable as a result of sampling from 
a common population. 

Our Fig. 3 again shows that the difference test will be the more stringent, but
we are not so far from the $\nu = 6$ curve as to believe the two tests will differ much
in any inference to be drawn from them. We have at once,

$$P_{\chi'^2-\chi^2} = 2\{.5 - S_2(4.25485)\} = .0646, \text{ from Table I.}$$

Again

$$Q_{\chi'^2/\chi^2} = 2 I_{\frac{1}{1+5.21126}}(2.5,\ 2.5) = 2 I_{.16100}(2.5,\ 2.5) = .0942, \text{ from Table II.}$$


The difference test is again more stringent than the ratio test; the former gives odds of
16 to 1 and the latter of 10 to 1 roughly against a common basis for the two tables.
On the whole we should probably conclude that the changes noted might not be
attributable to a common source, but we should not venture to be dogmatic about
such a conclusion. It will be noticed that the $P_{\chi'^2-\chi^2}$ probability is almost the
same as the $P'$ for $\chi'^2$, or the difference test pays here little attention to the
table in which there is a high probability of the two series being samples of the 
same population. 
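The difference test relies on the tabled $S_m$ function. As an independent numerical check (our own route, not the authors' Table I), one may use the fact that the $T_m(x)$ curve defined at the start of the paper is the density of the difference of two independent gamma variates of common shape $m + \frac12$ and unit scale. The sketch below integrates that difference directly; all function names are hypothetical, and the routine assumes a shape greater than 1.

```python
import math


def gamma_q(a, x):
    """Regularized upper incomplete gamma function Q(a, x), a > 0."""
    if x <= 0.0:
        return 1.0
    log_prefix = -x + a * math.log(x) - math.lgamma(a)
    if x < a + 1.0:
        # Series for the lower function P(a, x); then Q = 1 - P.
        term = 1.0 / a
        total = term
        denom = a
        for _ in range(2000):
            denom += 1.0
            term *= x / denom
            total += term
            if abs(term) < abs(total) * 1e-15:
                break
        return 1.0 - total * math.exp(log_prefix)
    # Continued fraction (modified Lentz method) for Q(a, x).
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, 2000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-15:
            break
    return h * math.exp(log_prefix)


def gamma_pdf(a, y):
    """Density of a unit-scale gamma variate of shape a > 1."""
    if y <= 0.0:
        return 0.0
    return math.exp((a - 1.0) * math.log(y) - y - math.lgamma(a))


def difference_test_p(d, shape, upper=80.0, steps=8000):
    """P(|G1 - G2| >= d) for independent gamma variates of equal shape > 1.

    Simpson's rule on  integral f(y) * Q(shape, y + d) dy, doubled for both signs.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        y = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * gamma_pdf(shape, y) * gamma_q(shape, y + d)
    return 2.0 * total * h / 3.0


# Illustration (v): m = 2, so the gamma shape is m + 1/2 = 2.5,
# and the argument is half the difference of the two chi-squares.
p_difference = difference_test_p(4.25485, 2.5)
print(round(p_difference, 4))
```

The printed value can be compared with the .0646 read from Table I. For integer shapes the integral has closed forms, e.g. $(1 + d/2)e^{-d}$ for shape 2, which gives a simple way of checking the routine itself.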


Illustration (vi). We will take further data from the same source, and consider 
whether the changes in the age distributions of those convicted of simple larceny 
in the years 1925 and 1930 can be attributed to some common cause in the case
of the two sexes. The Tables vi(a) and vi(b) are taken from the Judicial 
Statistics, England and Wales, Table X, p. 81, for 1925, and Table X (A), p. 70, 
for 1930. 


Sex and Age of Persons convicted of Simple Larceny in Courts of Summary Jurisdiction
(including Juvenile Courts), England and Wales, 1925 and 1930.


TABLE vi (a).
Males.

Year     Under 14   14 and     16 and     21 and     30 and     40 and     Over 60   Totals
                    under 16   under 21   under 30   under 40   under 60
1925        828        726       2682       4786       2775       2352       347      14496
1930        334        588       2662       4937       3273       2474       390      14658
Totals     1162       1314       5344       9723       6048       4826       737      29154
TABLE vi (b).
Females.

Year     Under 14   14 and     16 and     21 and     30 and     40 and     Over 60   Totals
                    under 16   under 21   under 30   under 40   under 60
1925         59         54        172        460        489        537        75       1846
1930         17         36        123        455        516        618        61       1826
Totals       76         90        295        915       1005       1155       136       3672








The main feature of the two tables is the decrease in juvenile and the increase in 
adult thieving. Is the source of this the same for the two series* ? 


The $\chi'^2$ for Table vi (a) is 272.6336, which for $\nu = 7$ connotes a probability
$P' < .000,0001$ for the two years being samples of the same population. The $\chi^2$ for
Table vi (b) is 42.7161, connoting a probability $P < .000,0005$ for the two years
being samples of the same population. Thus in the case of both males and females
there has been a most significant change in the age distributions. We then turn
to the problem of whether this change can be attributed to the same source in the
two sexes. We have

$$\chi'^2/\chi^2 = 6.3825 \quad\text{and}\quad \tfrac12\chi'^2 - \tfrac12\chi^2 = 114.95875.$$


Turning to the ratio test first,

$$Q_{\chi'^2/\chi^2} = 2 I_{\frac{1}{1+\chi'^2/\chi^2}}(3,\ 3) = 2 I_{.13546}(3,\ 3) = .0403.$$


On the ratio test accordingly the odds are about 24 to 1 against two such values,
$\chi'^2$ and $\chi^2$, occurring, if there were a common source. We should say therefore that
it was unlikely, but not excessively improbable, that the age changes in larceny
were the same for the two sexes.
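The ratio test figures of Illustrations (v) and (vi) depend on the incomplete B-function ratio $I_x(a, a)$ with $x = 1/(1 + \chi'^2/\chi^2)$ and $a = \frac12(\nu - 1)$. A minimal numerical sketch follows, using plain Simpson integration in place of the authors' Table II; the helper names are ours, and the quadrature is only intended for $a, b \ge 1$.

```python
import math


def incomplete_beta_ratio(x, a, b, steps=20000):
    """I_x(a, b) by Simpson's rule; intended for a >= 1, b >= 1 and 0 <= x < 1."""
    if x <= 0.0:
        return 0.0
    h = x / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (total * h / 3.0) / math.exp(log_beta)


def ratio_test_q(larger, smaller, parameter):
    """Q = 2 * I_x(parameter, parameter) with x = 1 / (1 + larger/smaller)."""
    x = 1.0 / (1.0 + larger / smaller)
    return 2.0 * incomplete_beta_ratio(x, parameter, parameter)


# Illustration (v): nu = 6, parameter (nu - 1)/2 = 2.5.
q_crime = ratio_test_q(10.5304, 2.0207, 2.5)
# Illustration (vi): nu = 7, parameter (nu - 1)/2 = 3.
q_larceny = ratio_test_q(272.6336, 42.7161, 3.0)
print(round(q_crime, 4), round(q_larceny, 4))
```

The two results can be set against the .0942 and .0403 quoted from Table II; small differences in the last figure are to be expected, since the tabled values were obtained by interpolation.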


We next take the difference test,

$$P_{\chi'^2-\chi^2} = 2\{.5 - S_{2.5}(114.95875)\}.$$

The value of this $S_{2.5}$ function lies outside our table, but we can say it is considerably
greater than $S_{2.5}(18) = .499,9996$, or we have

$$P_{\chi'^2-\chi^2} < .000,0008.$$


In other words the difference test shows that the probability of the changes in the 
two sexes being due to a common source is so vanishingly small that we may safely 
assert that they are not so due. 

* The “source” or “sources” may not be changes in the economic or moral state of the population:
instead of being of sociological origin the source may lie in police regulations, or in juridical changes; we
hazard no suggestion.











Thus the greater stringency of the difference test leads us to a far more definite
conclusion than the ratio test does. It may be noted that the marginal totals in 
Illustrations (v) and (vi) are not the arbitrary sizes of samples, they are the actual 
populations of criminals caught and convicted and we cannot modify their 
numbers. 


Illustration (vii). We will now apply our methods to a different type of 
investigation, namely to testing whether the means and variances of small samples 
are differentiated. 


The following data are drawn from Dr M. H. Williams’ measurements of boys 
aged 12 in rural schools in Worcestershire *. 


TABLE vii.
Central Heights in Inches of 12 years old Worcestershire Schoolboys.

Group                    48   49   50    51    52    53   54   55   56   57   58    59   60   61   62   63   Totals
School E                  —    1    —     3     2     1    —    4    3    —    1     1    —    —    —    —     16
School F                  —    —    —     —     2     3    4    1    4    1    —     —    1    —    —    —     16
Aggregate                 2    1   8.5  22.5  24.5   27   45   50   38   32  18.5    4    6    5    …    …      …
Aggregate less E and F    2    —   8.5  19.5  20.5   23   41   45   31   31  17.5    3    5    5    …    …      …
Combined E and F          —    1    —     3     4     4    4    5    7    1    1     1    1    —    —    —     32





The means and variances of these groups are as follows:

Group                     Mean (inches)   Variance      β₁
School E                  54.0000         7.291,667     —
School F                  54.6875         4.006,510     —
Aggregate                 54.7105         6.539,012     .0085
Aggregate less E and F    54.7569         6.128,3283    .0384
Combined E and F          54.34375        5.767,2526    .0163


If we have no information as to the $\beta_1$ of the aggregate, which sufficiently
indicates the symmetry of the distribution, we have (a) the experience that the
distribution of stature is very approximately normal for a given age, and (b) the
evidence that it is so from the combined samples E and F. The normality of the
parent distribution being assumed, we can proceed to test the hypothesis of the
equality of the variances for the two schools. The requisite formula is deducible
from (xxxi), if we suppose the means of the two samples to be unequal. We have

$$P_{\mu_2'-\mu_2} = 2\left\{.5 - S_{\frac12(n-2)}\left(\frac{n(\mu_2'-\mu_2)}{\mu_2'+\mu_2+\frac12(\bar{x}'-\bar{x})^2}\right)\right\}.$$


* “A Statistical Study of Oral Temperatures,” by M. H. Williams, M.B., Julia Bell, M.A. and Karl 
Pearson. Studies in National Deterioration, Drapers’ Company Research Memoirs, No. IX, Table LXII, 
p. 109, Cambridge University Press.





Now $n = 16$, $\mu_2' - \mu_2 = 3.285,157$,
$\mu_2' + \mu_2 = 11.298,177$,
$\bar{x}' - \bar{x} = .6875$.
Hence

$$P_{\mu_2'-\mu_2} = 2\{.5 - S_7(4.55698)\} = .228,37 \text{ by Table I.}$$

It is thus quite possible that $\mu_2'$ and $\mu_2$ would be equal, if the samples were
indefinitely increased.
Let us consider what would happen if we took instead of the variance of the
combined samples the variance $M_2$ of the aggregate of the Worcestershire schools,
i.e. 6.539,012. In this case the argument of the $S_m$ function is

$$\frac{n(\mu_2'-\mu_2)}{2M_2} = \frac{8 \times 3.285,157}{6.539,012} = 4.019,148,$$

and

$$P_{\mu_2'-\mu_2} = 2\{.5 - S_7(4.019,148)\} = .285,09.$$


This makes the equality of the variance in different schools somewhat more 
probable, but still of much the same order, while if we take the aggregate less 
Schools E and F, we shall get an intermediate value. This example suggests that
it will be adequate in many cases to use the variance of the combined samples in 
place of the usually unknown variance of the aggregate. 
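The arithmetic behind the two arguments of the $S_7$ function can be laid out in a few lines. This is only a sketch of the computation described above, taking the observed second moment of the combined samples as $\frac12(\mu_2' + \mu_2) + \frac14(\bar{x}' - \bar{x})^2$; the function names are ours.

```python
def combined_second_moment(var_a, var_b, mean_a, mean_b):
    """Variance of two equal-sized samples pooled about their common mean."""
    return 0.5 * (var_a + var_b) + 0.25 * (mean_a - mean_b) ** 2


def s_argument(n, var_larger, var_smaller, m2):
    """Argument of the S_m function: n (mu2' - mu2) / (2 M2)."""
    return n * (var_larger - var_smaller) / (2.0 * m2)


n = 16
mean_e, var_e = 54.0, 7.291667       # School E
mean_f, var_f = 54.6875, 4.006510    # School F

m2_combined = combined_second_moment(var_e, var_f, mean_e, mean_f)
arg_combined = s_argument(n, var_e, var_f, m2_combined)
arg_aggregate = s_argument(n, var_e, var_f, 6.539012)  # aggregate variance
order = (n - 2) / 2                                    # m = 7

print(round(m2_combined, 7), round(arg_combined, 5), round(arg_aggregate, 6), order)
```

The first figure reproduces the variance 5.767,2526 entered in the table for the combined schools, and the two arguments are those quoted as 4.55698 and 4.019,148.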

The appropriate equation to use when the samples are of the same size for the
ratio is (xliii), or since $\mu_2'/\mu_2 = 1.819,955$, we have

$$Q_{\mu_2'/\mu_2} = 2 I_{.3546}(7.5,\ 7.5) = .258,01.$$
The difference test is here slightly more stringent than the ratio test, but
neither is incompatible with the hypothesis that the standard deviations may be
the same in the two schools. The reader must be cautious in applying Fig. 3 to
such a case; he must not use $\mu_2'$ and $\mu_2$ as corresponding to $\chi'^2$ and $\chi^2$ and determine
the point $(\chi'^2, \chi^2)$ on the diagram. If he did so, he would find that point well
outside the $\nu = 16$ curve and so conclude that the ratio test was the more stringent.
Equations (xxix) to (xxxi) indicate that the correspondence is between $\chi'^2$ and
$n\mu_2'/M_2$ and $\chi^2$ and $n\mu_2/M_2$, or in our particular illustration, the values

$$\frac{16 \times 7.291,667}{5.767,2526} \quad\text{and}\quad \frac{16 \times 4.006,510}{5.767,2526}, \quad\text{or } 20.229 \text{ and } 11.115.$$

Looking this point out on the diagram we find it just inside the $\nu = 16$ curve, showing
that the tests will give approximately equal probabilities, but the difference test
the smaller one. A like procedure must be adopted in testing on Fig. 3 the
relative stringency of other applications of the two methods, i.e. we must inquire
from the argument of the $S_m$ function what values correspond to $\chi'^2$ and $\chi^2$.
Common factors disappearing in the ratio are apt to mislead, when we apply the
difference test.
It will be found generally advisable to use (xxxi) instead of (xxx), unless a
preliminary inquiry has settled whether $\bar{x}'$ and $\bar{x}$ may be considered equal. The
tests for this may be practically treated as twofold.

(a) If $\bar{x}'$ and $\bar{x}$ be the means of two samples of the same size taken from a
normal population of standard deviation $\Sigma$, and the samples be perfectly independent,
then $\bar{x}' - \bar{x}$ will be distributed normally with standard deviation $(2\Sigma^2/n)^{\frac12}$.
Accordingly if we form the ratio

$$\frac{\bar{x}'-\bar{x}}{(2\Sigma^2/n)^{\frac12}}$$

we can test for its probability by the integral of the normal curve. When we do
not know the parent population, the best value to give to $\Sigma$ appears to be that of
the combined samples. But here a question arises: Should $\Sigma^2$ be taken equal to
$\frac12(\mu_2' + \mu_2)$, which it would be if actually $\bar{x}' = \bar{x}$ (our hypothesis), or to

$$\tfrac12(\mu_2' + \mu_2) + \tfrac14(\bar{x}' - \bar{x})^2,$$

which is the observed value? We incline to the latter alternative. Accordingly

$$\frac{(\bar{x}'-\bar{x})\sqrt{n}}{\sqrt{\mu_2'+\mu_2+\frac12(\bar{x}'-\bar{x})^2}} = \frac{.6875 \times 4}{\sqrt{11.534,505}} = .8097$$

in our present case.

The probability of a ratio as large as or larger than this is .20906, and taking
the possibility of either sign for $\bar{x}' - \bar{x}$, we have .4181 for the chance of a
deviation of this size. The test therefore indicates that it would be legitimate to
consider the difference between $\bar{x}'$ and $\bar{x}$ as due to random sampling.

If we take for $\Sigma^2$ the value 6.539,012 of the aggregate, which would render
our theory more accurate*, we find the probability = .4470, and although this
differs to some extent from .4181, it leads to precisely the same conclusion.
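Method (a) reduces to a single normal deviate, so the two probabilities just quoted are easily recomputed. The sketch below uses the complementary error function for the two-sided normal tail; the function names are ours, and the two values of $\Sigma^2$ are those discussed in the text.

```python
import math


def mean_difference_ratio(diff, n, sigma_sq):
    """Ratio of the difference of two sample means to its standard deviation (2*Sigma^2/n)^(1/2)."""
    return diff / math.sqrt(2.0 * sigma_sq / n)


def two_sided_normal(z):
    """Chance of a normal deviate exceeding |z| in either direction."""
    return math.erfc(abs(z) / math.sqrt(2.0))


n = 16
diff = 54.6875 - 54.0                                  # difference of the school means
sigma_sq_observed = 0.5 * 11.298177 + 0.25 * diff ** 2  # combined samples
sigma_sq_aggregate = 6.539012                           # all Worcestershire schools

ratio_observed = mean_difference_ratio(diff, n, sigma_sq_observed)
p_observed = two_sided_normal(ratio_observed)
p_aggregate = two_sided_normal(mean_difference_ratio(diff, n, sigma_sq_aggregate))

print(round(ratio_observed, 4), round(p_observed, 4), round(p_aggregate, 4))
```

The three figures correspond to the .8097, .4181 and .4470 of the text.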


(b) We may adopt “Student’s z test.” Here the same assumption of
normality for the parent population is made, but we divide $\bar{x}' - \bar{x}$ by the observed
standard deviation of the difference $= \sqrt{\mu_2 + \mu_2'}$. Thus in our case

$$z = \frac{.6875}{\sqrt{11.298,177}},$$

from which we obtain the chance of the mean difference of the samples lying outside
±.6875 to be .4406, a value lying between the two values deduced from
method (a), and thus confirming the result that the schools very probably have the
same mean stature for boys of 12.

But to assume this makes no sensible difference in the conclusions already drawn 
with regard to the variances of the two samples. 


Illustration (viii). The following data for pulse rate and oral temperature are 
taken from the memoir referred to in the last Illustration, Table XI, p. 84. 


* Not absolutely so, because the theory assumes that we sample from an indefinitely large popula- 
tion, and that this population is strictly normal. 
TABLE viii.
Pulse Rate and Body Temperature in Children.

Pulse Rate

Body Temperature | 48–49 | 50–51 | 52–53 | 54–55 | 56–57 | 58–59 | 60–61 | 62–63 | 64–65 | 66–67 | 68–69 | 70–71 | 72–73 | 74–75 | 76–77 | Totals |
| 
Z 
A, 98°4°... ~{|—] 3] 7 6] Sy 3 | — - 22 
a ee oo Eee a ee Se 2{—|—|—|—j-] 2 
Ame Sc tah BS a BLE I 1 ee 
All Temperatures | 3 | 33 | 91 | 189 | 272} 134) 88 | 34/12] 7} 2); 2)]—j1 1 869 | 
| | | | | 





Before we discuss our samples let us ascertain something about the parent
population, of which of course we might have no knowledge, or we might not
know that pulse rate curves in children are very skew. We find that the following
are the chief constants of the distribution:

Mean = 56.8665, Variance = $M_2$ = 10.732,853,
Standard Deviation = $\Sigma$ = 3.276,103,
$\beta_1$ = .971,479, $\beta_2$ = 6.159,378;

the distribution curve is therefore decidedly skew and also markedly leptokurtic.


The distribution of means would have for its constants in the case of samples
of $n = 22$

$$\text{Mean} = 56.8665, \qquad \text{Variance} = \frac{M_2}{n} = .487,857,$$

$$B_1 = \frac{\beta_1}{n} = .044,158, \qquad B_2 = 3 + \frac{\beta_2 - 3}{n} = 3.143,608.$$
zt nt 

This is not very widely removed from a normal distribution, and we might well
conclude that a normal distribution for the means, as we do not know the true
curve, would be sufficient in this case. An examination of the chart of the $\beta_1$, $\beta_2$ plane*
indicates that the corresponding Pearson curve would be Type IV, but that we are
so close to the Gaussian point, that the normal curve would be likely to give us
quite reasonable results. Hence we are thrown back on (a) or (b) of the last
Illustration. We have

$$\zeta = \frac{(\bar{x}_n'-\bar{x}_n)\sqrt{n}}{\sqrt{2}\,\Sigma} = \frac{5.727,2727\,\sqrt{11}}{3.276,103} = 5.798,107 \tag{liv}.$$


This corresponds to a probability of about .000,000,007 of such a difference
occurring, if we based our investigation on a knowledge of the parent population.
It means that samples confined to temperatures 98.4° cannot be considered as
randomly chosen or that there is, which we know there to be, a correlation between
pulse rate and body temperature. But what are we to do if we do not know the


* See p. 66, Tables for Statisticians, Part I, 
actual parent population, but suppose it for good reason to be skew? We may put
our data as follows:
Pulse-Rate

Group                              Mean          Variance       β₁
A. Body Temperature 98.4°          54.454,546     4.460,055     .024,044
B. Body Temperature 99.4°          60.181,818     8.997,245     .343,350
A and B combined                   57.318,182    14.929,063     .237,756
Aggregate of all Temperatures      56.866,513    10.732,853     .971,479





Now assuming that pulse rates can be approximately given by a Type III curve,
we turn to equations (xxiii) and (xxiv). Considering first the argument of the
$T_m$ function, we see that the first value of $Y$ in (xxiv) would be appropriate if we
knew the mean and modal pulse rates, but the latter involves the determination of
the mode. Now we can determine these quantities from Equation (viii). We have
$p = 4/\beta_1 = 4.117,433$ in the present case, $\bar{x}^{*} = a\left(1 + \frac{1}{p}\right)$ and $a = \frac{p\Sigma}{\sqrt{p+1}}$; hence
$\bar{x} = \Sigma\sqrt{1+p}$. Thus, for the present case,

$$a = \frac{4.117,433 \times 3.276,103}{\sqrt{5.117,433}} = 5.962,908,$$

$$\text{Mean} - \text{mode} = \bar{x} - a = \frac{a}{p} = 1.448,210,$$

and $\tilde{x}$, modal pulse rate, accordingly, is given by

$$\tilde{x} = 55.418,303.$$

Now

$$Y = \frac{n\bar{x}^{*}(\bar{x}_n'-\bar{x}_n)}{\Sigma^2} = \frac{n\sqrt{1+p}\,(\bar{x}_n'-\bar{x}_n)}{\Sigma},$$

and for our particular case

$$Y = 87.003,946.$$

Again, the curve being given by

$$z = \tfrac12 N\,T_{n(p+1)-\frac12}(Y)$$

and $n(p+1) - \frac12 = 112.083,526$, we have, for the distribution of $Y$,

$$z = \tfrac12 N\,T_{112.0835}(Y),$$

and we need the area beyond 87.003,946.
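The constants of the Type III curve used above follow from $p$, the parent standard deviation and the parent mean. The sketch below simply retraces that arithmetic, taking $p = 4.117,433$ as quoted in the text; the function names are hypothetical.

```python
import math


def type_iii_constants(p, sigma, mean):
    """Distance a from start to mode, mean minus mode, mode, and start of the curve."""
    a = p * sigma / math.sqrt(p + 1.0)
    mean_minus_mode = a / p          # since the mean lies a(p + 1)/p from the start
    mode = mean - mean_minus_mode
    start = mode - a
    return a, mean_minus_mode, mode, start


def t_argument(n, p, sigma, mean_difference):
    """Argument Y = n * sqrt(1 + p) * (difference of sample means) / sigma."""
    return n * math.sqrt(1.0 + p) * mean_difference / sigma


p = 4.117433            # as quoted in the text
sigma = 3.276103        # standard deviation of all pulse rates
parent_mean = 56.866513
n = 22

a, mean_minus_mode, mode, start = type_iii_constants(p, sigma, parent_mean)
y_value = t_argument(n, p, sigma, 60.181818 - 54.454546)
order = n * (p + 1.0) - 0.5

print(round(a, 6), round(mean_minus_mode, 6), round(mode, 6), round(y_value, 4), round(order, 6))
```

These reproduce, to the accuracy of the rounding, the values 5.962,908, 1.448,210, 55.418,303, 87.003,946 and 112.083,526 of the text.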


But neither $T_m(Y)$ nor $S_m(Y)$ is tabled to such order of the function or to such
an ordinate. We turn therefore to p. 297, where we see that after $m = 11.5$ the
$T_m(Y)$ function coincides with Pearson’s Type VII curve. In order to obtain this
Type VII curve, we must write, for the $p + \frac12$ in (xviii), p. 297, $n(p+1) - \frac12$ of the
present notation. It then transforms to

$$z = z_0 \Big/ \left\{1 + \frac{Y^2}{4n(p+1)\{n(p+1)+1\}}\right\}^{n(p+1)+2.5} \tag{lv}.$$

* $\bar{x}$ is measured from start of the pulse rate curve, whereas $\bar{x}_n'$ and $\bar{x}_n$ are absolute values of the means.
In our particular case $n(p+1) + 2.5 = 115.083,526$, but for such values a Type VII
curve is for all practical purposes a normal curve, i.e.

$$z = z_0\, e^{-\frac{Y^2\{n(p+1)+2.5\}}{4n(p+1)\{n(p+1)+1\}}}.$$

Substituting the value of $Y$, this becomes

$$z = z_0\, e^{-\frac12\,\frac{n(\bar{x}_n'-\bar{x}_n)^2}{2\Sigma^2}\,\frac{n(p+1)+2.5}{n(p+1)+1}},$$

but the latter factor in the exponent is unity to the same degree of accuracy as we
have used in passing from the Type VII to the normal curve.

Thus our Bessel function $T_m(Y)$ curve has reduced to the normal curve

$$z = z_0\, e^{-\frac12\,\frac{n(\bar{x}_n'-\bar{x}_n)^2}{2\Sigma^2}} = z_0\, e^{-\frac12\zeta^2},$$

precisely the curve from which we obtained our first result that the two means were
not compatible with being random samples from the pulse rate population.


It may seem a misfortune that the example chosen does not fall within the
range of Table I. We will accordingly try if we have any better luck with the
ratio method. This is provided by Equation (xlii). But about this equation an important
point must be borne in mind: $\bar{x}_n'$ and $\bar{x}_n$ are not the absolute means, but
those means measured from the start of the curve. We must therefore subtract $a$ or
5.962,908 from $\tilde{x}$ the modal value or 55.418,303, to get the start of the curve which
is accordingly at 49.455,398 pulse rate*. Thus we have

$$\bar{x}_n' = 60.181,818 - 49.455,398 = 10.726,420$$

and

$$\bar{x}_n = 54.454,546 - 49.455,398 = 4.999,148,$$

giving the ratio $\bar{x}_n'/\bar{x}_n = 2.145,650$, or

$$Q_{\bar{x}_n'/\bar{x}_n} = 2 I_{.31790}(112.583,526,\ 112.583,526).$$


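Again we find no such high values of the incomplete B-function have been tabled.
Writing $m$ for 112.583,526, we have

$$Q_{\bar{x}_n'/\bar{x}_n} = 2\,\frac{\int_0^{.31790} x^{m-1}(1-x)^{m-1}\,dx}{\int_0^{1} x^{m-1}(1-x)^{m-1}\,dx}.$$

Put $x = \frac12 - x'$ and the transformation gives us

$$Q_{\bar{x}_n'/\bar{x}_n} = 2\,\frac{\int_{.1821}^{.5} (\frac14 - x'^2)^{m-1}\,dx'}{\int_{-.5}^{+.5} (\frac14 - x'^2)^{m-1}\,dx'}.$$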
* It must be remarked that in this case as in others the ratio as well as the difference test forces us
to make appeal by way of knowledge or of hypothesis to the real or supposed parent population. See
further on this point p. 328, ftn.


But $m$ is so large that we can safely replace our curve by the normal curve, and thus

$$Q_{\bar{x}_n'/\bar{x}_n} = 2\,\frac{\int_{.1821}^{.5} e^{-4mx'^2}\,dx'}{\int_{-.5}^{+.5} e^{-4mx'^2}\,dx'}.$$

Write $4mx'^2 = \frac12\xi^2$, and we have

$$Q_{\bar{x}_n'/\bar{x}_n} = 2\int_{\sqrt{8m}\times .1821}^{\sqrt{8m}\times .5} \frac{1}{\sqrt{2\pi}}\, e^{-\frac12\xi^2}\,d\xi.$$

Now $\sqrt{8m} \times .5 = 14.93877$ and may be replaced for this integral by $\infty$ and

$$\sqrt{8m} \times .1821 = 5.4407;$$

thus

$$Q_{\bar{x}_n'/\bar{x}_n} = 2\int_{5.4407}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac12\xi^2}\,d\xi,$$

or, using the probability integral table of the normal curve, = .000,000,05. Thus we
see that on the ratio hypothesis the randomness of the two samples is only slightly
less improbable than on the difference test. Both involve a knowledge of the pulse
rate distribution. It may be asked, what can we learn without this knowledge? The
reply must be that we can only work with the combined A and B as representing
to some extent *(!) the general pulse rate distribution. This gives


$$\beta_1 = .237,7557, \qquad \sigma_x = 3.863,8146, \qquad p + 1 = 16.823,9941,$$

and

$$Y = \frac{2n(\bar{x}_n'-\bar{x}_n)}{\sqrt{\beta_1}\,\sigma_x} = 133.757,708,$$

$$m = n(p+1) - \tfrac12 = 369.627,871,$$

or

$$z = \tfrac12 N\,T_{369.627,871}(133.757,708),$$

a value still farther removed than before from the range of $m$ and $Y$ in the $T_m$ and
$S_m$ tables, and still more appropriate to the application of a normal curve. The
* In many cases probably we could appeal to experience as to what sort of value $\beta_1$ is likely to take;
we know its value for many types of variate, weight, pulse rate, barometer heights, etc. If we do not have
any suggestion from experience, the only value we can take is that obtained from the combined samples,
even if it appears ludicrous to find $\beta_1$ on forty cases. But is it after all very much more absurd than
finding a variance on twenty cases, which the problem also requires? Again, it may be said that the difference
test makes more appeal to, or hypotheses about, the supposed or known parent population than the ratio
test does. But the reader must not forget that the whole theory of $\chi'^2$ in relation to $\chi^2$ is based on the
assumption that the relative frequencies of the parent population may be replaced by the relative
frequencies of the combined samples. This is clear enough if we approach $\chi^2$ as Pearson has done in
Biometrika, Vol. viii. p. 252.

If $F_s/M$ be the relative frequency of the $s$th category of the parent population $M$, then the true $\chi^2$ is
given by

$$\chi^2 = \frac{NN'}{N+N'}\, S\left\{\left(\frac{f_s}{N} - \frac{f_s'}{N'}\right)^2 \Big/ \frac{F_s}{M}\right\},$$

and it is not till we put $\frac{F_s}{M} = \frac{f_s + f_s'}{N + N'}$, i.e. the relative frequency of the combined samples, that we obtain
the value given by various writers for $\chi^2$. In other words, the very basis of the $\chi^2$ method is an appeal
to the combined sample relative frequencies as a representation of an infinite parent population. The
weakness of this when the samples may consist of 10 to 20 individuals is only too obvious.
normal curve in question will be that given by (lvi). We must now write, in the $\zeta$ of
(liv), $\sigma_x$ for $\Sigma$, or 3.863,8146 for 3.276,103, and we obtain

$$\zeta = 4.916,178,$$

giving a probability

$$P_{\bar{x}_n'-\bar{x}_n} = .000,000,88,$$

not nearly so stringent as the result when we know the parent population, but amply
sufficient to indicate that there exists a real difference between the mean pulse
rates at temperatures 98.4° and 99.4°.
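Both values of $\zeta$, and the two probabilities attached to them, follow from the normal curve alone. A short sketch of the arithmetic (function names ours; the standard deviations are the two quoted in the text):

```python
import math


def zeta(mean_difference, n, standard_deviation):
    """zeta = (difference of sample means) * sqrt(n / 2) / standard deviation, as in (liv)."""
    return mean_difference * math.sqrt(n / 2.0) / standard_deviation


def two_sided_normal(z):
    """Chance of a normal deviate exceeding |z| in either direction."""
    return math.erfc(abs(z) / math.sqrt(2.0))


n = 22
mean_difference = 60.181818 - 54.454546

zeta_parent = zeta(mean_difference, n, 3.276103)     # parent population known
zeta_combined = zeta(mean_difference, n, 3.8638146)  # combined samples A and B only

p_parent = two_sided_normal(zeta_parent)
p_combined = two_sided_normal(zeta_combined)

print(round(zeta_parent, 4), round(zeta_combined, 4), p_parent, p_combined)
```

The second probability is larger than the first by a factor of more than a hundred, yet both are of an order which leaves the conclusion unaltered.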


We have based the above investigation on the assumption that the parent
population was a Type III curve with a considerable skewness, but we reach the
important conclusion that, even with a $\beta_1$ of order 1.0, it will be adequate to apply
the normal curve as describing the distribution of samples; with smaller samples
and still greater skewness in the parent population, i.e. $p$ smaller, the $m = n(p+1) - \frac12$
may be small enough to come into the range of our $S_m$ table, or to bring

$$I_x\{n(p+1),\ n(p+1)\}$$

into the range, $n(p+1) \le 50$, of the B-function tables.


Unfortunately when we turn to test the standard deviations of the pulse rates
of temperatures 98.4° and 99.4° we are somewhat at a loss for a method, for, as far as
we are aware, no one has so far found a curve giving the distribution of either the
standard deviations or the variances of samples from a Type III curve. All we can
provide at present is a curve having the same first four moments as the required
curve. To do so would lead us somewhat away from the main topic of the present
paper. We can, however, give another illustration of Equation (xxxi) if we assume
that we shall not go far wrong by using the Type III distribution of $\mu_2$ to apply
approximately to this case.


If we confine our attention to the two samples and their combination, we easily
find from the table on p. 326, that

$$\frac{\frac12 n(\mu_2'-\mu_2)}{\frac12(\mu_2'+\mu_2)+\frac14(\bar{x}_n'-\bar{x}_n)^2} = \frac{\frac12 n \times 4.537,1900}{14.929,063} = 3.343,0825$$

and $m = \frac12(\nu - 2) = \frac12(n - 2) = 10$.
Thus

$$P_{\mu_2'-\mu_2} = 2\{.5 - S_{10}(3.343,0825)\} = .4532 \text{ from Table I.}$$

If we had used the variance of the total temperature curve, i.e. 10.732,853,
we should have found $P_{\mu_2'-\mu_2} = .3000$.

Both indicate that it is not unreasonable on our hypothesis to suppose the
variance of the two samples the same. Turning now to the ratio test, we have, since
$\mu_2'/\mu_2 = 2.017,2946$,

$$Q_{\mu_2'/\mu_2} = .1159 \text{ by Table II.}$$
This is not entirely opposed to the equality of the two standard deviations, but
gives that equality only a fourth of the probability. Thus for the first time in these
illustrations the ratio gives the more stringent test. Let us see if this would à priori
have been indicated by Fig. 3, p. 308; we must take





for $\chi'^2$: $\dfrac{\frac12 n\mu_2'}{M_2}$, and for $\chi^2$: $\dfrac{\frac12 n\mu_2}{M_2}$,

where $M_2 = \frac12(\mu_2' + \mu_2) + \frac14(\bar{x}_n' - \bar{x}_n)^2 = 14.929,063$, or 6.6293 and 3.2862 are the
required values. The diagram indicates that this point is well outside the curve
$\nu = 22$, and thus the ratio will give the more stringent test.


We have for the purpose of illustration used throughout both tests, but Fig. 3,
p. 308, will always enable the investigator to choose à priori the test which provides
the lesser probability.


(16) We next turn to the important problem of determining from the mean
square contingencies whether they may be considered as samples from a common
parent population. Now if we have a $\kappa \times \lambda$ contingency table, where $\kappa > \lambda$, the mean
square contingency $\phi^2$ must lie between 0 and $\lambda - 1$.

Thus as far as a Pearson curve may be considered applicable (this is our first
condition) it must theoretically be of the limited range type or of form

$$y = y_0(\phi^2)^{p_1}(\rho - \phi^2)^{p_2} \tag{lvii},$$

where $\rho = \lambda - 1$. In the case of such a curve of known range the values of $p_1$ and
$p_2$ can be determined in terms of the mean $\phi^2$ and of the variance. Let these be
$\bar\phi^2$ and $\sigma^2_{\phi^2}$, then we have

$$p_1 + 1 = \frac{\bar\phi^2}{\rho}\left(\frac{\bar\phi^2(\rho - \bar\phi^2)}{\sigma^2_{\phi^2}} - 1\right), \qquad p_2 + 1 = \left(1 - \frac{\bar\phi^2}{\rho}\right)\left(\frac{\bar\phi^2(\rho - \bar\phi^2)}{\sigma^2_{\phi^2}} - 1\right) \tag{lviii}.$$

Now when the size of the sample is fairly large, the variance of $\phi^2$ will be a small
quantity compared with the product of the two segments of the range as divided
by the mean $\bar\phi^2$, i.e. $\bar\phi^2(\rho - \bar\phi^2)$.

The most unfavourable case for the largeness of either $p_1$ or $p_2$ occurs when they
are nearly equal and $N$ the size of the sample is small. For example, if $\bar\phi^2 = .5$ and
$\sigma^2_{\phi^2}$ is of the order .01 and $\kappa = \lambda = 2$, then $p_1 = p_2 = 11$. (lvii) becomes in this case
a Type II curve and the distribution of means of samples from such a curve has
not yet been solved in any practically useful manner. But the $\beta_2$ for this case is
about 2.8 and we should not err greatly by treating the distribution as practically
normal*, and then the distribution of the means of $\phi^2$ would also be normal.
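The determination of $p_1$ and $p_2$ from the mean and variance is a matter of two lines of arithmetic once the common factor $\bar\phi^2(\rho - \bar\phi^2)/\sigma^2_{\phi^2} - 1$ has been found. A minimal sketch, with a hypothetical function name, applied to the example just given:

```python
def curve_exponents(mean, variance, rho):
    """Exponents p1, p2 of y = y0 * x**p1 * (rho - x)**p2 from its mean and variance.

    Returns the common factor together with p1 and p2.
    """
    common = mean * (rho - mean) / variance - 1.0
    p1 = (mean / rho) * common - 1.0
    p2 = (1.0 - mean / rho) * common - 1.0
    return common, p1, p2


# The 2 x 2 example of the text: mean .5, variance of order .01, range rho = 1.
common, p1, p2 = curve_exponents(0.5, 0.01, 1.0)
print(round(common, 6), round(p1, 6), round(p2, 6))  # both exponents come out close to 11
```

The same function serves for any table once the mean and variance of $\phi^2$ are known; the labour lies in finding those two constants, not in this step.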

Professor Kondo has made a second experiment for a 3 × 3 table with $N = 250$
(loc. cit. in footnote below, pp. 441, 419–420). The number of cells is here again


* Professor Kondo (Biometrika, Vol. xxi. pp. 416–418) has dealt with a case of this kind. He has
dealt with the observed mean and variance of $\phi^2$ in 804 samples of 100, from a parent population of
$\phi^2 = .5$; here the observed value of $\bar\phi^2 = .499,8005$ rendered the curve slightly skew.
very limited, but any one who has endeavoured to take several hundred samples of
contingency for 2 × 2 tables will appreciate the amount of labour involved. In this
second experiment $p_2$ is large, and it would be adequate practically to replace
(lvii) by

$$y = y_0(\phi^2)^{p_1}\, e^{-p_2\phi^2/\rho} \tag{lix}.$$

If $p_1$ were the larger, we should have to measure our variate $\phi^2$ from the other end
of the range and it would be the term with power $p_1$ which would be replaced by
the exponential. If therefore either $p_1$ or $p_2$ be large, we can write our distribution
of $\phi^2$:

$$y = y_0(c\phi^2)^{p_1}\, e^{-c\phi^2} \tag{lix bis},$$

where $c = p_2/\rho$, and $p_1$ and $p_2$ are given by (lviii). In this case the distribution of
$\phi'^2 - \phi^2$, if the samples are of the same size and the same number of cells, is
given by

$$z = \tfrac12 M\,T_{p_1+\frac12}\{c(\phi'^2 - \phi^2)\} \tag{lx},$$

and the probability of a difference as great or greater is given by

$$P_{\phi'^2-\phi^2} = 1 - 2S_{p_1+\frac12}[c(\phi'^2 - \phi^2)] \tag{lxi}.$$

The corresponding probability of the ratio $\phi'^2/\phi^2$ is given by

$$Q_{\phi'^2/\phi^2} = 2 I_{\frac{1}{1+\phi'^2/\phi^2}}(p_1 + 1,\ p_1 + 1) \tag{lxii}.$$

Thus the difference test involves the determination of one more constant $p_2$ than
the ratio test with $p_1$ only. But this does not much increase the labour, as $p_1$ already
involves a knowledge of both $\bar\phi^2$ and $\sigma^2_{\phi^2}$, which are the laborious quantities to
determine. The values of these quantities to a second approximation are given in
the memoir by Professor Kondo, already referred to. We will now discuss special cases.

(17) The problem to be discussed in this section is to find the probability that
two values $\phi^2$ and $\phi'^2$ of mean square contingency could have been obtained for
samples of the same size, $N$, the same number of cells, $\kappa \times \lambda$, with a parent population
of zero contingency.

In this case we know that approximately*

$$\bar\phi^2 = \frac{(\kappa-1)(\lambda-1)}{N}\left(1 + \frac{1}{N}\right) \tag{lxiii},$$
. 2 & (« —1)(A—1) ee a 
oy2= wilt 1a- va i ica” jcee (1 + sy) f ...(Ixiv). 


We might compute these for any given case $N$, $\kappa$, $\lambda$, and then substitute in
(lviii). But as the approximation is only of the second order, we can save workers
some trouble by making the substitution algebraically, neglecting in the process
terms which we are not warranted in retaining.


* Kondo, loc. cit. p. 408. 













Remembering that $\rho = \lambda - 1$, if $\lambda < \kappa$, and calling $F$ the factor common to $p_1$
and $p_2$, we have


. 1)(a-1 
F= $2 oe oi?) _ 1=4N Q-1{I + xt —S wee w+n ...(Ixv), 
where y=(e-1)-1)—(e-2)-—>. 
Now 
«-1)(A-1 
mtinOxP jientietio y+ yt ye + | 
4(«-—1)(A-1) ) {+ a ES a a A | ..(Ixvi). | 


For the extreme case of $N \to \infty$, we see that $p_1 + 1 \to \frac12(\kappa-1)(\lambda-1)$, or we
have for the ratio test

$$Q_{\phi'^2/\phi^2} = 2 I_{\frac{1}{1+\phi'^2/\phi^2}}\{\tfrac12(\kappa-1)(\lambda-1),\ \tfrac12(\kappa-1)(\lambda-1)\}.$$

For the $\chi^2$ test we found

$$Q_{\chi'^2/\chi^2} = 2 I_{\frac{1}{1+\chi'^2/\chi^2}}\{\tfrac12(\nu-1),\ \tfrac12(\nu-1)\},$$

so that we cannot pass to the $\phi^2$ test by writing merely $\nu = \kappa\lambda$. This would only
be true if $\kappa$ and $\lambda$ were very large indeed, which would be very rare in practical
work. For the like reason $p_1 + \frac12$ can only be taken as $\frac12(\nu - 2)$, where $\nu = \kappa\lambda$, when
not only $N$ is very large but $(\kappa + \lambda - 2)/(\kappa\lambda)$ is a very small fraction. For example, if
$\kappa = \lambda = 10$, or a contingency table of 100 cells, which is very unusual, we can hardly
assert that 18/100 is a very small fraction.


Next turning to $p_2$, we have





a =1NQ- f ro “ ( xt ( («-1)QA- 1) ie 
po +1 N(x ad ne 1+3+ We? (+) 
} 
a ee ~1){r7.—2)(y Lh f 
= Wa-(1+” c-9 Mh im wets 2) — at) ee sy (Ixvii). 


Illustration (ix). Suppose $\kappa = \lambda = 5$ and $N = 400$, what is the probability that
$\phi^2 = .02$ and $\phi'^2 = .07$ could both be drawn as samples of $N$ from a population of
zero contingency?

Here we find from (lxiii) and (lxiv) that $\bar\phi^2 = .040,100$, and $\sigma^2_{\phi^2} = .0001,9199$,
while $\rho = \lambda - 1 = 4$.

We may now proceed to find $p_1$ and $p_2$ either from (lviii), or from (lxvi) and
(lxvii). We cannot say that (lviii) is more accurate than (lxvi) and (lxvii), because
in (lviii) we retain terms which are of an order we have neglected in (lxiii) and
(lxiv), and which ought to be neglected because they have been neglected in passing
from (lvii) to (lix). Working with (lviii) we find first from (lxiii) and (lxiv),

$$\bar\phi^2 = .040,100, \qquad \sigma^2_{\phi^2} = .0001,9199,$$




whence the common factor $F = 826.0847$. Further $\bar\phi^2/\rho = .010,025$, and

$$1 - \bar\phi^2/\rho = .989,975,$$

leading by (lviii) to

$$p_1 = 7.2815 \quad\text{and}\quad p_2 = 816.8032.$$

Working with (lxv) we have at once

$$\gamma = 16 - 3 - \tfrac12 = 12.5,$$

whence $F = 800(1 + .03125 + .0013) = 826.0400$.
Thus $p_1 = 7.2811$ and $p_2 = 816.7589$.

We see therefore that the two methods accord well, and further that $p_2$ is so
large that (lix bis) will adequately express the distribution, where $c = 204.1897$.

Finally, by (lxi), the probability of the occurrence of the difference

$$\phi'^2 - \phi^2 = .05$$

is given by

$$P_{\phi'^2-\phi^2} = 1 - 2S_{7.7815}(10.2095).$$

And again

$$Q_{\phi'^2/\phi^2} = 2 I_{.2222}(8.2815,\ 8.2815).$$

Both involve a twofold interpolation with regard first to the arguments, and
secondly to the orders of the functions. For the purpose of appreciating the probabilities
the hyperbolic formula* will suffice. From Table I we find

$$S_{7.7815}(10.2095) = .4920,0818$$

and thus

$$P_{\phi'^2-\phi^2} = .01598.$$

Again from Table II we have

$$I_{.2222}(8.2815,\ 8.2815) = .008,324,230,$$

and accordingly

$$Q_{\phi'^2/\phi^2} = .01665.$$

Such values would only lead us to say that it is not very probable that $\phi^2$ and
$\phi'^2$ were samples from a population having zero contingency. Here, as so often,
the difference test is, if only slightly, still more stringent than the ratio test.

In order to determine this à priori from our diagram, Fig. 3, p. 308, we must first
find $\nu$ from the relation $\frac12(\nu - 2) = 7.7815$, or $\nu = 17.563$. The curve corresponding
to this is nearly mid-way between the curves for $\nu = 17$ and 18. Corresponding to
$\chi^2$ and $\chi'^2$ we have $2c\phi^2$ and $2c\phi'^2$, or 8.1676 and 28.5866. This point is just inside
the $\nu = 18$ curve and so clearly within the $\nu = 17.563$ curve; thus the difference test
is the more stringent.


(18) The previous section indicates that the solution of the problem of two mean square contingencies arising as samples from the same population is relatively easy, if that population has zero contingency. This follows from the fact that the approximate expressions for $\bar{\phi}^2$ and $\sigma^2_{\phi^2}$ are known and relatively simple. In the case where the contingency is not zero the problem is much harder, as although approximate expressions are known for $\bar{\phi}^2$ and $\sigma^2_{\phi^2}$ they are laborious to determine in particular cases.

* Tables for Statisticians and Biometricians, Part II. Formula (a), p. xviii, i.e.
20, = P¥=00 + OXZ 014 GE Mis saeuy vesseuevavinansdacaseesenneseed (Ixviii). 










Illustration (x). We will first illustrate the method in a particular example provided by Kondo and then consider what is needed in order that the values of $\bar{\phi}^2$ and $\sigma^2_{\phi^2}$ might be obtained more readily.


Kondo took 250 samples of size 200 from an infinite population of a 3 × 3 table with the following proportional frequencies*:

         |  .0831  |  .0786  |  .0270  |  .1887
         |  .1032  |  .2864  |  .0862  |  .4758
         |  .0335  |  .1235  |  .1785  |  .3355
  Totals |  .2198  |  .4885  |  .2917  | 1.0000
| 





From this table Kondo computes

$$\bar{\phi}^2 = .206,503, \qquad \sigma^2_{\phi^2} = .0043,3796.$$

Clearly $p = 3 - 1 = 2$, and thus

$$\bar{\phi}^2/p = .103,2515 \quad\text{and}\quad p - \bar{\phi}^2 = 1.793,497.$$

Using Equations (lviii) we find

$$p_1 = 7.712,064, \qquad p_2 = 74.665,056,$$

only differing slightly from Kondo's values†. $p_2$ is considerable, and we take it that the distribution curve of $\phi^2$ may be reasonably represented by (lix bis), or

$$y = y_0 \{\tfrac12 (74.665,056)\, \phi^2\}^{7.712,064}\, e^{-\frac12 (74.665,056)\, \phi^2}.$$

Accordingly the chance of a difference as great as or greater than $\phi'^2 - \phi^2$ is

$$P_{\phi'^2-\phi^2} = 1 - 2S_{8.212,064}\{\tfrac12\, 74.665,056\, (\phi'^2 - \phi^2)\},$$

and the chance of a ratio at least as great as $\phi'^2/\phi^2$ is

$$Q_{\phi'^2/\phi^2} = 2I_{\frac{1}{1+\phi'^2/\phi^2}}(8.712,064,\ 8.712,064).$$
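The arithmetic of this fit can be retraced in a few lines. The sketch below assumes that Equations (lviii), which lie outside this section, take the same moment form as (lxxiv) applied to $\phi^2/p$, namely $p_1 + 1 = (\bar{\phi}^2/p)F$ and $p_2 + 1 = (1 - \bar{\phi}^2/p)F$ with $F = \bar{\phi}^2 (p - \bar{\phi}^2)/\sigma^2_{\phi^2} - 1$. That form is inferred from the figures quoted in the text, and the function name is ours.

```python
def fit_p1_p2(mean_phi2, var_phi2, p):
    """Moment fit of y = y0 * x**p1 * (1 - x/p)**p2 on the range 0 <= x <= p.

    Assumed form of Equations (lviii), inferred from the quoted figures.
    """
    # Common factor F = mean * (p - mean) / variance - 1.
    factor = mean_phi2 * (p - mean_phi2) / var_phi2 - 1.0
    p1 = (mean_phi2 / p) * factor - 1.0
    p2 = (1.0 - mean_phi2 / p) * factor - 1.0
    return factor, p1, p2


# Kondo's values for the 3 x 3 table: mean .206,503, variance .0043,3796, p = 2.
factor, p1, p2 = fit_p1_p2(0.206503, 0.00433796, 2)
print(f"F = {factor:.4f}, p1 = {p1:.6f}, p2 = {p2:.6f}")
```

The printed constants may be compared with the values $p_1 = 7.712,064$ and $p_2 = 74.665,056$ given above.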


Now Kondo has given the 250 values of $\phi^2$ which he obtained for his samples; the largest of these is .370,502 and the least .054,201‡. Let us inquire what is the chance of such a difference occurring in samples of 200 taken from the table above.

The difference $\phi'^2 - \phi^2 = .316,301$ and the ratio $\phi'^2/\phi^2 = 6.835,704$; hence we require

$$S_{8.212,064}(11.808,316) \quad\text{and}\quad I_{.127,621}(8.712,064,\ 8.712,064).$$

Using the hyperbolic interpolation formula (lxviii), we find

$$S_{8.212,064}(11.808,316) = .496,3113 \quad\text{and}\quad I_{.127,621}(8.712,064,\ 8.712,064) = .000,111,08.$$

Accordingly $P_{\phi'^2-\phi^2} = .007,3774$ and $Q_{\phi'^2/\phi^2} = .000,222,16$.
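Table I is not reproduced in this section, but the probability it yields can be checked by direct quadrature. The sketch below assumes the definition given at the head of the paper, $T_m(x) = |x|^m K_m(|x|) / \{\sqrt{\pi}\, 2^m\, \Gamma(m + \tfrac12)\}$, evaluates $K_m$ from its standard integral representation, and integrates the tail by Simpson's rule. The function names, step counts and cut-off points are our own choices; the authors worked by interpolation in their table, not by this route.

```python
import math


def bessel_k(order, x, steps=400):
    """K_order(x), x > 0, from the integral of exp(-x cosh t) cosh(order t), t = 0..inf."""
    # Upper limit chosen so that the neglected tail is far below rounding error.
    t_max = math.acosh(1.0 + (60.0 + 10.0 * order) / x)
    h = t_max / steps
    total = 0.5 * math.exp(-x)  # integrand at t = 0 with trapezoidal end weight
    for i in range(1, steps + 1):
        t = i * h
        weight = 0.5 if i == steps else 1.0
        total += weight * math.exp(-x * math.cosh(t)) * math.cosh(order * t)
    return total * h


def t_curve(m, x):
    """Ordinate T_m(x) = |x|^m K_m(|x|) / (sqrt(pi) 2^m Gamma(m + 1/2)), x != 0."""
    ax = abs(x)
    log_const = 0.5 * math.log(math.pi) + m * math.log(2.0) + math.lgamma(m + 0.5)
    return math.exp(m * math.log(ax) - log_const) * bessel_k(m, ax)


def difference_probability(m, d, intervals=600):
    """P = 1 - 2 S_m(d), the chance of a scaled difference beyond +/- d (d > 0)."""
    upper = d + 60.0 + 5.0 * m  # the ordinate is negligible beyond this point
    h = (upper - d) / intervals
    total = t_curve(m, d) + t_curve(m, upper)
    for i in range(1, intervals):
        total += (4.0 if i % 2 else 2.0) * t_curve(m, d + i * h)
    return 2.0 * total * h / 3.0


# Kondo's extreme pair: order p1 + 1/2 = 8.212,064, argument 11.808,316.
p_diff = difference_probability(8.212064, 11.808316)
print(f"P = {p_diff:.5f}; expected cases in 250 samples = {250 * p_diff:.2f}")
```

For $m = \tfrac12$ the curve reduces to $\tfrac12 e^{-|x|}$, so that the routine should return $e^{-d}$; this gives a simple check on the quadrature before the result is compared with the interpolated value $.007,3774$.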


* Biometrika, Vol. XXI. p. 411.   † Biometrika, Vol. XXI. p. 419.   ‡ Ibid. p. 412.

Referring to Fig. 3, p. 308, we have to enter with $e\phi'^2 = 27.66$, $e\phi^2 = 4.05$ and $\nu = 18.424$, or since this point lies well below even the $\nu = 18$ curve the ratio test is here the more stringent. With 250 samples of 200 we should expect by the difference test 1.8 cases with a $\phi^2$ difference as great as or greater than .316. Kondo experimentally found one, but a second, .313, runs it close. By the ratio test we should expect only .0555, say .06 of a case in 250 samples. Thus while the ratio test is the more stringent in this particular case, the difference test accords better with Kondo's experience, and would determine more accurately the range of $\phi^2$ in such an experiment. It may be remarked that neither Kondo's maximum nor his minimum $\phi^2$'s are outlying values; they run:

At top .370,502, .367,532, .362,669, .343,805,

and at bottom .054,201, .063,055, .080,840, .098,408.

Thus the two methods could not be brought more into accordance by the omission of an exceptional outlyer. Assuming the arithmetic to be correct, and it has been carefully checked, this case seems to be of importance as an indication of the value of the difference test when subjected to experimental verification.

The above illustration shows that there is no difficulty in applying the difference test to two values of $\phi^2$, if the parent population with a definite contingency be supposed known and the two samples have the same size and the like cell distribution, provided $\bar{\phi}^2$ and $\sigma^2_{\phi^2}$ can be determined. When we have no real or hypothetical parent population, our only method is to suppose the parent population to be obtained from the combined samples which are used to give the relative frequencies.


(19) The problem remains as to what can be done to simplify the labour of obtaining $\bar{\phi}^2$ and $\sigma^2_{\phi^2}$ in the case of contingency tables with a reasonably large size of sample. Now Professor Kondo has shown* that, to a second approximation,

$$\bar{\phi}^2 = \phi_0^2 + \frac{\psi_1}{N} + \frac{\psi_2}{N^2}, \qquad \sigma^2_{\phi^2} = \frac{f_1}{N} + \frac{f_2}{N^2} \qquad \text{(lxix)},$$

where $\psi_1$, $\psi_2$, $f_1$, $f_2$ are functions of the relative frequencies of the parent population, i.e. of its cell frequencies on the basis of a total frequency of unity†, and $\phi_0^2$ is the mean squared contingency of the same population‡.

Now if we substitute these values in (lviii), the expressions for $p_1$ and $p_2$ become very complicated, even if we only retain the first two terms in the expansion. We get simple results if we retain only the leading term. In this case

$$p_1 + 1 = \frac{N \phi_0^4\, (p - \phi_0^2)}{p f_1}, \qquad p_2 + 1 = \frac{N \phi_0^2\, (p - \phi_0^2)^2}{p f_1} \qquad \text{(lxx)}.$$

* Biometrika, Vol. XXI. p. 413.
† They are in fact the chances that an individual will be drawn from each particular cell.
‡ $\bar{\phi}^2$ is of course the mean of the mean squared contingency of samples and only approaches $\phi_0^2$ as $N$ is increased.

These expressions do not contain $\psi_1$, $\psi_2$ or $f_2$, which would not therefore require calculation. How far can (lxx) be used instead of (lviii) and the fuller but still incomplete results (lxix)? This cannot be determined till far more experimental work has been undertaken on the two sets of approximations. We may place here the formulae from which $\phi_0^2$ and $f_1$ are to be determined. Let $c_{pq}$ be the chance that an individual will be drawn from the $p$th row $q$th column, $c_{p.}$ the chance that it may be drawn from anywhere in the $p$th row and $c_{.q}$ from anywhere in the $q$th column. Let $C_{pq} = c_{pq}^2/(c_{p.} c_{.q})$ and $C_{p.} = S_q (C_{pq})$, i.e. the sum of the $C_{pq}$ for the $p$th row; in the same manner let $C_{.q} = S_p (C_{pq})$, or be the sum of $C_{pq}$ for the $q$th column. Then

$$\phi_0^2 = S (C_{pq}) - 1 \qquad \text{(lxxi)},$$

where $S$ denotes a summation for the whole table, and

$$f_1 = 4 S\!\left(\frac{C_{pq}^2}{c_{pq}}\right) - 3 S_p\!\left(\frac{C_{p.}^2}{c_{p.}}\right) - 3 S_q\!\left(\frac{C_{.q}^2}{c_{.q}}\right) + 2 S\!\left(\frac{C_{pq}\, C_{p.}\, C_{.q}}{c_{pq}}\right) \qquad \text{(lxxii)}.$$


A sample of the actual working required to obtain an $f_1$ is provided on the following page; it is for the table of Kondo's on p. 334 of this paper. It is considerable, but not prohibitive for a table of 3 × 3 cells. We can now illustrate the approximate formulae (lxx) on this case, where the mean squared contingency of the population is $\phi_0^2$ and

$$\phi_0^2 = .188,8925, \qquad p = 2, \qquad f_1 = .804,0440.$$

These give us $p_1 = 7.0370$ and $p_2 = 76.0590$, whereas when we use (lviii) with the fuller values given by Kondo for $\bar{\phi}^2$ and $\sigma^2_{\phi^2}$ we have $p_1 = 7.7121$ and $p_2 = 74.6651$*.
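The working of $f_1$ is laborious by hand but short by machine. The sketch below follows the sums as they can be read from the printed working for Kondo's table: the population mean squared contingency is $S(C_{pq}) - 1$, $f_1$ is the combination $4S(C_{pq}^2/c_{pq}) - 3S_p(C_{p.}^2/c_{p.}) - 3S_q(C_{.q}^2/c_{.q}) + 2S(C_{pq} C_{p.} C_{.q}/c_{pq})$, and the leading-term constants are $p_1 + 1 = (\phi_0^2/p)\,N\phi_0^2(p - \phi_0^2)/f_1$ with the complementary factor for $p_2 + 1$. The function and variable names are ours, and every cell chance is assumed to be greater than zero.

```python
def phi2_and_f1(cells):
    """Return (population phi^2, f1) for a table of cell chances, after (lxxi), (lxxii)."""
    rows, cols = len(cells), len(cells[0])
    row_tot = [sum(r) for r in cells]                                        # c_p.
    col_tot = [sum(cells[p][q] for p in range(rows)) for q in range(cols)]   # c_.q
    big_c = [[cells[p][q] ** 2 / (row_tot[p] * col_tot[q]) for q in range(cols)]
             for p in range(rows)]                                           # C_pq
    big_row = [sum(r) for r in big_c]                                        # C_p.
    big_col = [sum(big_c[p][q] for p in range(rows)) for q in range(cols)]   # C_.q
    pairs = [(p, q) for p in range(rows) for q in range(cols)]
    s_cells = sum(big_c[p][q] ** 2 / cells[p][q] for p, q in pairs)
    s_rows = sum(big_row[p] ** 2 / row_tot[p] for p in range(rows))
    s_cols = sum(big_col[q] ** 2 / col_tot[q] for q in range(cols))
    s_cross = sum(big_c[p][q] * big_row[p] * big_col[q] / cells[p][q] for p, q in pairs)
    phi2 = sum(big_row) - 1.0
    f1 = 4.0 * s_cells - 3.0 * s_rows - 3.0 * s_cols + 2.0 * s_cross
    return phi2, f1


def leading_term_fit(phi2, f1, p, n):
    """p1 and p2 from the leading-term formulae (lxx) for samples of size n."""
    factor = n * phi2 * (p - phi2) / f1
    return (phi2 / p) * factor - 1.0, (1.0 - phi2 / p) * factor - 1.0


kondo = [[0.0831, 0.0786, 0.0270],
         [0.1032, 0.2864, 0.0862],
         [0.0335, 0.1235, 0.1785]]
phi2, f1 = phi2_and_f1(kondo)
p1, p2 = leading_term_fit(phi2, f1, p=2, n=200)
print(f"phi^2 = {phi2:.7f}, f1 = {f1:.7f}, p1 = {p1:.4f}, p2 = {p2:.4f}")
```

The output may be set against $.188,8925$, $.804,0440$, $7.0370$ and $76.0590$ as quoted above; agreement to the last printed figure is not to be expected, since the hand working was carried to seven places at each step.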


We shall have $\tfrac12 p_2 = 38.0295$, and accordingly

$$P_{\phi'^2-\phi^2} = 1 - 2S_{7.5370}(12.0288).$$

From Table I we find

$$S_{7.5370}(12.0288) = .4971,7440$$

and accordingly $P_{\phi'^2-\phi^2} = .00565$.

Thus in 250 samples we might expect 1.4 occasions on which a difference as great as or greater than that observed would arise. Accordingly we have actually approached nearer to the experimental result by using the less approximate values of $\bar{\phi}^2$ and $\sigma^2_{\phi^2}$. For such a table as we are dealing with, it is clear that Equations (lxx) will amply suffice for practical purposes.


* Kondo gives $f_1 = .801,842$, but we have failed to find a slip in our arithmetic. This leads to $p_1 = 7.059$, $p_2 = 76.271$ and $P_{\phi'^2-\phi^2} = .00490$.

Actual Working of an $f_1$.

Row $p = 1$ ($c_{1.} = .1887$)
                        |   q = 1     |   q = 2     |   q = 3
  $c_{1q}$              | .0831       | .0786       | .0270
  $c_{1q}^2$            | .0069,0561  | .0061,7796  | .0007,2900
  $c_{1.} c_{.q}$       | .0414,7626  | .0921,7995  | .0550,4379
  $C_{1q}$              | .166,4955   | .067,0206   | .013,2440
  $C_{1q}^2$            | .0277,2075  | .0044,9176  | .0001,7540
  $C_{1q}^2/c_{1q}$     | (illegible) | (illegible) | (illegible)
  $C_{1q}/c_{1q}$       | 2.003,5560  | .852,6794   | .490,5185
  $C_{1.} = S(C_{1q}) = .246,7601$;  $C_{1.}^2 = .0608,9055$;  $C_{1.}^2/c_{1.} = .322,6844$;  $S_q(C_{1q}^2/c_{1q}) = .397,2264$

Row $p = 2$ ($c_{2.} = .4758$)
                        |   q = 1     |   q = 2     |   q = 3
  $c_{2q}$              | .1032       | .2864       | .0862
  $c_{2q}^2$            | .0106,5024  | .0820,2496  | .0074,3044
  $c_{2.} c_{.q}$       | .1045,8084  | .2324,2830  | .1387,9086
  $C_{2q}$              | .101,8374   | .352,9044   | .053,5370
  $C_{2q}^2$            | .0103,7086  | .1245,4152  | .0028,6621
  $C_{2q}^2/c_{2q}$     | .100,4928   | .434,8517   | .033,2507
  $C_{2q}/c_{2q}$       | .986,7965   | 1.232,2081  | .621,0789
  $C_{2.} = S(C_{2q}) = .508,2788$;  $C_{2.}^2 = .2583,4734$;  $C_{2.}^2/c_{2.} = .542,9747$;  $S_q(C_{2q}^2/c_{2q}) = .568,5952$

Row $p = 3$ ($c_{3.} = .3355$)
                        |   q = 1     |   q = 2     |   q = 3
  $c_{3q}$              | .0335       | .1235       | .1785
  $c_{3q}^2$            | .0011,2225  | .0152,5225  | .0318,6225
  $c_{3.} c_{.q}$       | .0737,4290  | .1638,9175  | .0978,6535
  $C_{3q}$              | .015,2184   | .093,0629   | .325,5723
  $C_{3q}^2$            | .0002,3160  | .0086,6070  | .1059,9732
  $C_{3q}^2/c_{3q}$     | .006,9134   | .070,1271   | .593,8226
  $C_{3q}/c_{3q}$       | .454,2806   | .753,5457   | 1.823,9345
  $C_{3.} = S(C_{3q}) = .433,8536$;  $C_{3.}^2 = .1882,2890$;  $C_{3.}^2/c_{3.} = .561,0401$;  $S_q(C_{3q}^2/c_{3q}) = .670,8631$

Columns ($c_{..} = 1.0000$)
                                    |   q = 1     |   q = 2     |   q = 3
  $c_{.q}$                          | .2198       | .4885       | .2917
  $C_{.q}$                          | .283,5513   | .512,9879   | .392,3533
  $C_{.q}^2$                        | .0804,0134  | .2631,5659  | .1539,4111
  $C_{.q}^2/c_{.q}$                 | .365,7932   | .538,7034   | .527,7378
  $S_p\{(C_{pq}/c_{pq})\, C_{p.}\}$ | 1.1930,5669 | 1.1636,4101 | 1.2280,4222
  Same $\times\ C_{.q}$             | .338,2928   | .596,9338   | .481,8264

Totals
  $S(C_{pq}) = 1.188,8925$ (thus $\phi_0^2 = .188,8925$)
  $S(C_{pq}^2/c_{pq}) = S_{1.} + S_{2.} + S_{3.} = 1.636,6847$
  $S_p(C_{p.}^2/c_{p.}) = 1.426,6992$
  $S_q(C_{.q}^2/c_{.q}) = 1.432,2344$
  $S(C_{pq}\, C_{p.}\, C_{.q}/c_{pq}) = 1.417,0530$

Finally, by (lxxii),

$$f_1 = 4 \times 1.636,6847 - 3 \times 1.426,6992 - 3 \times 1.432,2344 + 2 \times 1.417,0530 = .804,0440.$$

(20) We can still further extend the usefulness of our $T_m(Y)$ function and its probability integral to many cases where we wish to investigate the difference between two squared correlation ratios or two squared multiple correlation coefficients. If these be $\eta^2$ and $R^2$, those quantities can only vary between 0 and 1, and an appropriate curve for their distributions will be

$$y = y_0\, (\eta^2)^{p_1} (1 - \eta^2)^{p_2}, \quad\text{or}\quad y = y_0\, (R^2)^{p_1} (1 - R^2)^{p_2} \qquad \text{(lxxiii)}.$$

Curves of this form are known to be accurate when we are sampling from a population, wherein there is no correlation of character and the distribution is of normal type.

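Equations (viii) to find $p_1$ and $p_2$ will then apply and we may write them

$$p_1 + 1 = \bar{\eta}^2 \left\{\frac{\bar{\eta}^2 (1 - \bar{\eta}^2)}{\sigma^2_{\eta^2}} - 1\right\}, \qquad p_2 + 1 = (1 - \bar{\eta}^2) \left\{\frac{\bar{\eta}^2 (1 - \bar{\eta}^2)}{\sigma^2_{\eta^2}} - 1\right\},$$

$$p_1 + 1 = \bar{R}^2 \left\{\frac{\bar{R}^2 (1 - \bar{R}^2)}{\sigma^2_{R^2}} - 1\right\}, \qquad p_2 + 1 = (1 - \bar{R}^2) \left\{\frac{\bar{R}^2 (1 - \bar{R}^2)}{\sigma^2_{R^2}} - 1\right\} \qquad \text{(lxxiv)}.$$

The values of $p_1$ and $p_2$ can be determined if the mean and variance of $\eta^2$ or $R^2$ be known.

If either $p_1$ or $p_2$ be large, say $p_2$, we are thrown back on the Type III curve

$$y = y_0\, (p_2 \eta^2)^{p_1} e^{-p_2 \eta^2}, \quad\text{or}\quad y = y_0\, (p_2 R^2)^{p_1} e^{-p_2 R^2} \qquad \text{(lxxv)},$$

and accordingly the distribution of the difference of the $\eta^2$ of two samples, or the $R^2$, will be given by

$$y = \tfrac12 M\, T_{p_1+\frac12}\{p_2 (\eta'^2 - \eta^2)\}, \quad\text{or}\quad y = \tfrac12 M\, T_{p_1+\frac12}\{p_2 (R'^2 - R^2)\} \qquad \text{(lxxvi)},$$

and Table I can be applied, or

$$P_{\eta'^2-\eta^2} = 1 - 2S_{p_1+\frac12}\{p_2 (\eta'^2 - \eta^2)\} \quad\text{and}\quad P_{R'^2-R^2} = 1 - 2S_{p_1+\frac12}\{p_2 (R'^2 - R^2)\} \qquad \text{(lxxvii)}.$$

It is needless to add that the ratio test can be also used in such cases, or

$$Q_{\eta'^2/\eta^2} = 2I_{\frac{1}{1+\eta'^2/\eta^2}}(p_1 + 1,\ p_1 + 1) \quad\text{and}\quad Q_{R'^2/R^2} = 2I_{\frac{1}{1+R'^2/R^2}}(p_1 + 1,\ p_1 + 1) \qquad \text{(lxxviii)}.$$

Here as before we need apparently only $p_1$ to find $Q$, but we have actually to find $p_2$, or we have no justification for replacing (lxxiii) by (lxxv).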

Illustration. In the special case of a normal population with uncorrelated characters, we know that

$$p_1 = \tfrac12 (n - 3), \qquad p_2 = \tfrac12 (N - n - 2) \qquad \text{(lxxix)},$$

where for $\eta^2$, $N$ is the size of the sample and $n$ the number of arrays on which $\eta^2$ is based, while for $R^2$, $N$ is again the size of the sample and $n$ = total number of variates considered, i.e. the dependent variate and $n - 1$ variates with which it is multiply correlated.

Now it is obvious that in a very large number of cases $N$ will be large, often very large as compared with $n$. In such cases (lxxv) will apply, and we may write

$$P_{\eta'^2-\eta^2} = 1 - 2S_{\frac12 (n-2)}[\tfrac12 (N - n - 2)(\eta'^2 - \eta^2)],$$
$$P_{R'^2-R^2} = 1 - 2S_{\frac12 (n-2)}[\tfrac12 (N - n - 2)(R'^2 - R^2)] \qquad \text{(lxxx)},$$

and for the case of the ratio

$$Q_{\eta'^2/\eta^2} = 2I_{\frac{1}{1+\eta'^2/\eta^2}}\{\tfrac12 (n - 1),\ \tfrac12 (n - 1)\},$$
$$Q_{R'^2/R^2} = 2I_{\frac{1}{1+R'^2/R^2}}\{\tfrac12 (n - 1),\ \tfrac12 (n - 1)\} \qquad \text{(lxxx bis)}.$$

The length which this paper has already reached hinders our providing special numerical examples for this section, but they would only be similar in type to those of the earlier sections, and accordingly little harm will be done by their omission.
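Although no numerical example is given for this section, the entries needed for Tables I and II follow mechanically from $N$, $n$ and the two observed values. The sketch below assumes the constants $p_1 = \tfrac12(n - 3)$ and $p_2 = \tfrac12(N - n - 2)$ stated above; the function name, the dictionary keys and the figures in the example are hypothetical and are not taken from the paper.

```python
def eta_test_arguments(n_sample, n_arrays, eta2_small, eta2_large):
    """Table arguments for comparing two squared correlation ratios.

    Follows (lxxix), (lxxx) and (lxxx bis); meaningful only when n_sample is
    large compared with n_arrays, so that the Type III form (lxxv) applies.
    """
    p1 = 0.5 * (n_arrays - 3)
    p2 = 0.5 * (n_sample - n_arrays - 2)
    return {
        "S order": p1 + 0.5,                                   # equals (n - 2) / 2
        "S argument": p2 * (eta2_large - eta2_small),
        "I argument": 1.0 / (1.0 + eta2_large / eta2_small),
        "I parameters": (p1 + 1.0, p1 + 1.0),                  # each (n - 1) / 2
    }


# Hypothetical figures for illustration only: N = 500, n = 8 arrays.
args = eta_test_arguments(500, 8, 0.010, 0.035)
for name, value in args.items():
    print(name, value)
```

The difference test is then read from Table I with the order and argument shown, and the ratio test from Table II with the stated argument and parameters, doubling the tabled entry in each case as in (lxxx) and (lxxx bis).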


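(21) A limitation which detracts somewhat from a full use of the present methods must have struck the reader. Namely, we may need to compare the constants of samples which are not of the same size.

In this case our two original equations are of the form

$$y_u = \frac{M \gamma_1^{\tau_1+1}}{\Gamma(\tau_1+1)}\, u^{\tau_1} e^{-\gamma_1 u} \quad\text{and}\quad y_v = \frac{M \gamma_2^{\tau_2+1}}{\Gamma(\tau_2+1)}\, v^{\tau_2} e^{-\gamma_2 v},$$

and the combined frequency surface is given by

$$w = \frac{y_u\, y_v}{M} = \frac{M \gamma_1^{\tau_1+1} \gamma_2^{\tau_2+1}}{\Gamma(\tau_1+1)\, \Gamma(\tau_2+1)}\, e^{-\gamma_1 u - \gamma_2 v}\, u^{\tau_1} v^{\tau_2}\, [du\, dv] \qquad \text{(lxxxi)}.$$

Let us first approach this from the standpoint of the ratio of $v$ to $u$, and write $z = v/u$. Then (lxxxi) becomes

$$w = \frac{M \gamma_1^{\tau_1+1} \gamma_2^{\tau_2+1}}{\Gamma(\tau_1+1)\, \Gamma(\tau_2+1)}\, e^{-(\gamma_1 + \gamma_2 z) u}\, u^{\tau_1+\tau_2+1} z^{\tau_2}\, [du\, dz].$$

Write $\xi = (\gamma_1 + \gamma_2 z)\, u$ and we have

$$w = \frac{M \gamma_1^{\tau_1+1} \gamma_2^{\tau_2+1}}{\Gamma(\tau_1+1)\, \Gamma(\tau_2+1)}\, \frac{e^{-\xi}\, \xi^{\tau_1+\tau_2+1}\, z^{\tau_2}}{(\gamma_1 + \gamma_2 z)^{\tau_1+\tau_2+2}}\, [d\xi\, dz].$$

Keeping $z$ constant integrate out for $\xi$, the limits of which will be from 0 to $\infty$ corresponding to $u$ from 0 to $\infty$. Thus we find

$$y_z = \frac{M \gamma_1^{\tau_1+1} \gamma_2^{\tau_2+1}}{B(\tau_1+1,\ \tau_2+1)}\, \frac{z^{\tau_2}}{(\gamma_1 + \gamma_2 z)^{\tau_1+\tau_2+2}} \qquad \text{(lxxxii)}$$

for the distribution curves of the ratio.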
Now take the total frequency as unity and integrate (lxxxii) for the probability $P'_{z_0}$ of a ratio greater than $z_0$,

$$P'_{z_0} = \frac{\gamma_1^{\tau_1+1} \gamma_2^{\tau_2+1}}{B(\tau_1+1,\ \tau_2+1)} \int_{z_0}^{\infty} \frac{z^{\tau_2}\, dz}{(\gamma_1 + \gamma_2 z)^{\tau_1+\tau_2+2}} \qquad \text{(lxxxiii)}.$$

Now put $\dfrac{1}{y} = 1 + z \dfrac{\gamma_2}{\gamma_1}$, and hence $dz = -\dfrac{\gamma_1}{\gamma_2} \dfrac{1}{y^2}\, dy$; further when $z = \infty$, $y = 0$, and when $z = z_0$, $y = \dfrac{1}{1 + z_0 \gamma_2/\gamma_1}$. Thus we reach

$$P'_{z_0} = \frac{1}{B(\tau_1+1,\ \tau_2+1)} \int_0^{\frac{1}{1 + z_0 \gamma_2/\gamma_1}} y^{\tau_1} (1 - y)^{\tau_2}\, dy = I_{\frac{1}{1 + z_0 \gamma_2/\gamma_1}}(\tau_1 + 1,\ \tau_2 + 1) \qquad \text{(lxxxiv)},$$

where $I$ is the function tabled in the Incomplete B-function Table now at press.

Just as earlier (see p. 306) we should add to this the other equal wedge for $u/v > z_0$, or $z$ taking values from 0 to $1/z_0$. Calling this $P''_{z_0}$ we have, from (lxxxiii),

$$P''_{z_0} = \frac{\gamma_1^{\tau_1+1} \gamma_2^{\tau_2+1}}{B(\tau_1+1,\ \tau_2+1)} \int_0^{1/z_0} \frac{z^{\tau_2}\, dz}{(\gamma_1 + \gamma_2 z)^{\tau_1+\tau_2+2}}.$$

Now transform our integral by taking $y = \dfrac{z \gamma_2}{\gamma_1 + z \gamma_2}$, or limits will be $y = 0$ and $y = \dfrac{1}{1 + z_0 \gamma_1/\gamma_2}$, and the integral becomes

$$P''_{z_0} = \frac{1}{B(\tau_1+1,\ \tau_2+1)} \int_0^{\frac{1}{1 + z_0 \gamma_1/\gamma_2}} y^{\tau_2} (1 - y)^{\tau_1}\, dy = I_{\frac{1}{1 + z_0 \gamma_1/\gamma_2}}(\tau_2 + 1,\ \tau_1 + 1) \qquad \text{(lxxxv)}.$$

Combining (lxxxiv) and (lxxxv), we reach finally

$$Q_{z_0} = I_{\frac{1}{1 + z_0 \gamma_2/\gamma_1}}(\tau_1 + 1,\ \tau_2 + 1) + I_{\frac{1}{1 + z_0 \gamma_1/\gamma_2}}(\tau_2 + 1,\ \tau_1 + 1) \qquad \text{(lxxxvi)},$$

a result involving a double entry into the B-function Table.

Thus, if we use the ratio test, (lxxxvi) shows that the determination of $Q_{z_0}$ involves very little more trouble than in the cases we have already dealt with when $\gamma_1 = \gamma_2$ and $\tau_1 = \tau_2$.

Illustration. Suppose we have two samples rendering variances with values $\sigma_1^2$ and $\sigma_2^2$, but that the size of one sample is $n_1$ and of the other $n_2$, and we wish to test whether they may be considered as drawn from the same parent population of variance $\Sigma^2$. Then, by (xxviii),

$$\tau_1 = \tfrac12 (n_1 - 3), \qquad \tau_2 = \tfrac12 (n_2 - 3), \qquad \gamma_1 = \frac{n_1}{2\Sigma^2}, \qquad \gamma_2 = \frac{n_2}{2\Sigma^2}.$$

Accordingly

$$Q_{\sigma_2^2/\sigma_1^2} = I_{\lambda}\{\tfrac12 (n_1 - 1),\ \tfrac12 (n_2 - 1)\} + I_{\lambda'}\{\tfrac12 (n_2 - 1),\ \tfrac12 (n_1 - 1)\} \qquad \text{(lxxxvii)},$$

where

$$\lambda = \frac{1}{1 + \dfrac{\sigma_2^2 n_2}{\sigma_1^2 n_1}} \quad\text{and}\quad \lambda' = \frac{1}{1 + \dfrac{\sigma_2^2 n_1}{\sigma_1^2 n_2}}.$$

A question may arise as to whether the sum of the two $I$-functions may not be greater than unity and so $Q_{\sigma_2^2/\sigma_1^2} > 1$ and impossible. Clearly the denominators of both B-function ratios are equal, since

$$B\{\tfrac12 (n_1 - 1),\ \tfrac12 (n_2 - 1)\} = B\{\tfrac12 (n_2 - 1),\ \tfrac12 (n_1 - 1)\}.$$

We can therefore write $Q_{\sigma_2^2/\sigma_1^2}$ in the form

$$\frac{\displaystyle\int_0^{\lambda} x^{\frac12 (n_1-3)} (1 - x)^{\frac12 (n_2-3)}\, dx + \int_0^{\lambda'} x^{\frac12 (n_2-3)} (1 - x)^{\frac12 (n_1-3)}\, dx}{B\{\tfrac12 (n_1 - 1),\ \tfrac12 (n_2 - 1)\}}.$$

Write $x = 1 - x'$ in the second integral, and we have

$$\frac{\displaystyle\int_0^{\lambda} x^{\frac12 (n_1-3)} (1 - x)^{\frac12 (n_2-3)}\, dx + \int_{1-\lambda'}^{1} x'^{\frac12 (n_1-3)} (1 - x')^{\frac12 (n_2-3)}\, dx'}{B\{\tfrac12 (n_1 - 1),\ \tfrac12 (n_2 - 1)\}}.$$

Now the numerator will certainly be less than the denominator if $\lambda < 1 - \lambda'$, for then the two integrals do not make up the complete B-function. Hence our condition is that

$$\frac{\dfrac{\sigma_2^2 n_1}{\sigma_1^2 n_2}}{1 + \dfrac{\sigma_2^2 n_1}{\sigma_1^2 n_2}} > \frac{1}{1 + \dfrac{\sigma_2^2 n_2}{\sigma_1^2 n_1}},$$

or that

$$1 < \left(\frac{\sigma_2^2}{\sigma_1^2}\right)^2.$$

This is a condition always satisfied, since we have started by supposing $\sigma_2^2 > \sigma_1^2$ and asked how much greater that ratio may be than unity, without being too improbable for both samples to have a common source.
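The double entry into the B-function table can be imitated by quadrature. The sketch below assumes only the form of (lxxxvii) reached above, with parameters $\tfrac12(n_1 - 1)$ and $\tfrac12(n_2 - 1)$ and arguments $1/(1 + \sigma_2^2 n_2/\sigma_1^2 n_1)$ and $1/(1 + \sigma_2^2 n_1/\sigma_1^2 n_2)$. Simpson's rule stands in for the printed table; the function names are ours, the simple quadrature is restricted to parameters not less than unity (samples of three or more), and the figures in the example are hypothetical.

```python
import math


def incomplete_beta(x, a, b, intervals=2000):
    """Regularised incomplete B-function I_x(a, b) by Simpson's rule (a, b >= 1)."""
    if a < 1.0 or b < 1.0:
        raise ValueError("this simple quadrature assumes a >= 1 and b >= 1")
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def ordinate(t):
        return t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)

    h = x / intervals
    total = ordinate(0.0) + ordinate(x)
    for i in range(1, intervals):
        total += (4.0 if i % 2 else 2.0) * ordinate(i * h)
    return total * h / 3.0 / math.exp(log_beta)


def variance_ratio_probability(var_small, n_small, var_large, n_large):
    """Q of (lxxxvii): sample 1 has the smaller variance, sample 2 the larger."""
    ratio = var_large / var_small
    a1, a2 = 0.5 * (n_small - 1), 0.5 * (n_large - 1)
    lam = 1.0 / (1.0 + ratio * n_large / n_small)
    lam_dash = 1.0 / (1.0 + ratio * n_small / n_large)
    return incomplete_beta(lam, a1, a2) + incomplete_beta(lam_dash, a2, a1)


# Hypothetical samples: variance 4.0 from 10 observations, 9.0 from 25.
q_ratio = variance_ratio_probability(4.0, 10, 9.0, 25)
print(f"Q = {q_ratio:.5f}")
```

When the two variances are equal the arguments become complementary and the two terms sum to unity, which is the limiting case of the condition discussed above; with equal sample sizes the expression reduces to twice a single entry of Table II.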

We now turn to the problem of the difference $v - u$. Taking Equation (lxxxi), let us put

$$u + v = X, \qquad v - u = Y,$$

then the surface becomes

$$w = w_0\, e^{-\frac12 (\gamma_2 - \gamma_1) Y}\, e^{-\frac12 (\gamma_1 + \gamma_2) X}\, (X - Y)^{\tau_1} (X + Y)^{\tau_2}\, [dX\, dY].$$

Now we desire to integrate this with regard to $X$ from $Y$ to $\infty$, keeping $Y$ constant, for the portion of our surface, precisely as in § (3), p. 295. Now write $X = Yt$ and the limits of $t$ will be 1 to $\infty$; thus the frequency surface for $Y$, which gives the difference, is

$$z = w_0\, e^{-\frac12 (\gamma_2 - \gamma_1) Y}\, Y^{\tau_1+\tau_2+1} \int_1^{\infty} e^{-\frac12 (\gamma_1 + \gamma_2) Y t}\, (t - 1)^{\tau_1} (t + 1)^{\tau_2}\, dt \qquad \text{(lxxxviii)}.$$

This curve is allied to the $T_m(x)$ Bessel Function curve and passes into it when $\tau_1 = \tau_2 = \tau$, but as far as we are aware has not yet been studied. Its consideration is left for another occasion. If $\tau_1 = \tau_2 = \tau$ and we write

$$Y' = \tfrac12 (\gamma_1 + \gamma_2)\, Y, \qquad p = \frac{\gamma_2 - \gamma_1}{\gamma_2 + \gamma_1}, \qquad\text{and}\qquad m = \tau + \tfrac12,$$

we have, by the proper choice of $w_0$,

$$z = M (1 - p^2)^{\tau+1}\, e^{-pY'}\, T_m(Y')\, [dY'] \qquad \text{(lxxxviii bis)},$$

$T_m(Y')$ not changing sign with $Y'$, and when we put $Y'$ negative we get the other section of the area we need to integrate to get rid of $X$ as in § (3), p. 295. This curve has been fully discussed in Biometrika, Vol. XXI. pp. 168-187. Its area when integrated for $Y'$ between $-\infty$ and $+\infty$ equals $M$, and we have its ordinates tabled*.

* See Biometrika, Vol. XXI. pp. 195-201, or Tables for Statisticians and Biometricians, Part II, pp. lxxix-lxxxviii and pp. 188-144.

In terms of the difference

$$z = \tfrac12 M\, \frac{(4 \gamma_1 \gamma_2)^{\tau+1}}{(\gamma_1 + \gamma_2)^{2\tau+1}}\, e^{-\frac12 (\gamma_2 - \gamma_1) Y}\, T_{\tau+\frac12}\{\tfrac12 (\gamma_1 + \gamma_2)\, Y\}\, [dY] \qquad \text{(lxxxix)}.$$

This curve for practical purposes may be replaced by a Pearson curve of the same first four moments after $\tau = 11$*.


— 


Illustration. It may be asked for what type of variates would it be adequate to have $\tau_1 = \tau_2$? We reply at once for $\eta^2$ and $R^2$, when the sizes of the samples are considerable as compared with the number of arrays in $\eta^2$ or the number of variates involved in $R^2$. This is the case where we are justified in using Equations (lxxv), and consequently (lxxxix). If $n$ be the number of arrays or the total number of variates, and $N$, $N'$ be the sizes of the samples, then we have

$$\tau_1 = \tau_2 = \tau = \tfrac12 (n - 3),$$

while

$$\gamma_1 = \tfrac12 (N - n - 2), \qquad \gamma_2 = \tfrac12 (N' - n - 2),$$

and the curve of distribution of, say, $\eta'^2 - \eta^2$ will be given by

$$z = \tfrac12 M\, \frac{\{(N - n - 2)(N' - n - 2)\}^{\frac12 (n-1)}}{\{\tfrac12 (N + N') - n - 2\}^{n-2}}\, e^{-\frac14 (N' - N) Y}\, T_{\frac12 (n-2)}\{\tfrac14 (N + N' - 2n - 4)\, Y\}\, [dY],$$

where $Y = \eta'^2 - \eta^2$ (xc),

and we assume the parent population to be without association in its characters.

There is small difficulty in plotting this curve from Dr Elderton's Table of ordinates, and at present the planimeter or a quadrature formula must be applied to determine the requisite areas beyond the values $\pm(\eta'^2 - \eta^2)$ observed. An important point arises from both the Equations (lxxxviii) and (lxxxix), namely that when the sizes of the samples are unequal, then the difference of two statistical measures does not give a symmetrical curve, but one which like that of the distribution of the first product-moment coefficient may be notably skew. The fuller discussion of these curves must, however, be left to a further paper.

It may be asked: Why, if the ratio-test gives an adequately simple expression for the probability, should we deal further with the more complicated expressions for the probability of the difference? The answer, we think, is that the results of the two tests are often not so closely in accord that we can be confident one may not for a particular case be more correct than the other. The probability deduced leads at once to a frequency, and the touchstone of the more correct test is that it should give this frequency the more exactly and more often. The only way of determining this is the experimental one. This would not be hard in the case of the range of $\chi^2$'s based on pairs of samples taken from a uni-variate population. It would, however, be far more laborious in the case of sample contingency tables taken from a bi-variate table. Still it is to be hoped that such work will eventually be undertaken.
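The experimental route suggested here is easily imitated by random sampling. The sketch below draws the difference $Y = v - u$ of two Type III variates with a common $\tau$ but different $\gamma$, and compares its skewness with the equal-$\gamma$ case. It is a simulation of our own devising, not the authors' procedure; the constants and the seed are hypothetical, and `random.gammavariate` is given the shape $\tau + 1$ and the scale $1/\gamma$.

```python
import random


def simulate_difference(tau, gamma_1, gamma_2, draws=100_000, seed=1):
    """Draw Y = v - u, where u and v follow Type III curves with a common tau.

    u has density proportional to u**tau * exp(-gamma_1 * u), and v likewise
    with gamma_2.
    """
    rng = random.Random(seed)
    return [rng.gammavariate(tau + 1.0, 1.0 / gamma_2)
            - rng.gammavariate(tau + 1.0, 1.0 / gamma_1) for _ in range(draws)]


def mean_and_skewness(values):
    """Sample mean and moment coefficient of skewness m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return mean, m3 / m2 ** 1.5


# Hypothetical constants chosen only for illustration.
equal_mean, equal_skew = mean_and_skewness(simulate_difference(3.0, 1.0, 1.0))
unequal_mean, unequal_skew = mean_and_skewness(simulate_difference(3.0, 1.0, 2.0))
print(f"equal gammas:   mean {equal_mean:+.3f}, skewness {equal_skew:+.3f}")
print(f"unequal gammas: mean {unequal_mean:+.3f}, skewness {unequal_skew:+.3f}")
```

With equal $\gamma$ the simulated curve should be sensibly symmetrical about zero, while with unequal $\gamma$ both the mean $(\tau + 1)(1/\gamma_2 - 1/\gamma_1)$ and the skewness depart from zero, in keeping with the remark above on samples of unequal size.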

(22) General Conclusions. 

The main purpose of the present paper has been to discuss an alternative to the ratio method in dealing with a number of statistical coefficients, such as $\chi^2$, $\sigma^2$, $\Sigma^2$, $\eta^2$, $R^2$, $\phi^2$, which on certain hypotheses as to the parent population obey accurately or approximately the Type III form of distribution. We have seen that the difference of two like statistical coefficients follows in its distribution the $T_m(Y)$ Bessel-Function curve, or in certain cases the more general skew form $e^{-pY} T_m(Y)$. We have provided a table (Table I) of the probability integral of the former curve, by aid of which the probability of a given difference can be rapidly found. We have postponed discussion of a still more general curve, i.e. (lxxxviii), to a later paper.

* loc. cit. p. 181.


A number of suitable illustrations were chosen by one of our number at random; that is to say without knowledge of what would flow from them, and the difference and the ratio methods both applied. The results show that in the great majority of cases the difference test is more "stringent" than the ratio test. By this we merely understand that the probability of the observed result is less by the former test. Under such conditions, however, it is reasonable to suppose that preference ought to be given to it. At any rate it justifies the use of the difference test alongside the ratio test. In order to simplify the application of the latter test, it has been approached by a new method and a table (Table II) is provided by which the probability on the ratio test can be at once determined. We have further prepared a diagram by aid of which it can be rapidly ascertained which of the tests will be found the more stringent in a particular case.

In the course of our work we have pointed out difficulties which occur in using the $\chi^2$ methods, and warned the student of difficulties which may arise if the two series from which $\chi^2$ is obtained are of very different sizes in the case when their relative sizes are wholly arbitrary; we indeed question whether, when the relative sizes are not "naturally" fixed, we ought to use $\chi^2$ at all*. In the case of the comparison of two $\chi^2$'s we have with some diffidence suggested a possible method of overcoming some of the difficulty. The reader will see that much yet remains to be done, especially experimental spade-work, to obtain satisfactory tests for either the difference or ratio of these statistical coefficients in the case of different sized samples, especially small ones, when the parent population is unknown.


* There is another point about the usual $\chi^2$ distribution frequently overlooked. Given two series of $\nu$ categories each and sizes $N$ and $N'$, then the maximum possible value of $\chi^2$ is $N + N'$. Accordingly the distribution curve is limited, and should take some such form as

$$y = y_0\, (\chi^2)^{p_1} \left(1 - \frac{\chi^2}{N + N'}\right)^{p_2} \qquad (\epsilon_1)$$

rather than

$$y = y_0\, (\tfrac12 \chi^2)^{\frac12 (\nu - 3)}\, e^{-\frac12 \chi^2} \qquad (\epsilon_2).$$

If $N + N'$ be large, and then only, approach will be made to

$$p_1 = \tfrac12 (\kappa - 1)(\lambda - 1) - 1 \quad\text{and}\quad p_2 = \tfrac12 (N + N')(\lambda - 1) - 1,$$

but in our case $\lambda = 2$ and $\kappa = \nu$. Thus $(\epsilon_1)$ becomes

$$y = y_0'\, (\tfrac12 \chi^2)^{\frac12 (\nu - 3)} \left(1 - \frac{\chi^2}{N + N'}\right)^{\frac12 (N + N') - 1},$$

which takes the form of $(\epsilon_2)$ if $N + N'$ be very large. The Type III curve is therefore no more exact in the case of $\chi^2$ than it is in the case of $\eta^2$ or $R^2$; it depends upon the large size of $N + N'$. This point is frequently overlooked, but it is actually involved when we replace the binomial distributions by normal curves in our deduction of $(\epsilon_2)$.

TABLE II. Values of $I_x\{\tfrac12 (\nu - 1),\ \tfrac12 (\nu - 1)\}$, $\nu = 2$ to 25 and $x = .01$ to $.50$, for determining the Probability of Ratios, e.g. $\chi'^2/\chi^2$.

x = .01-.50     ν = 8-13

  x  | ν = 8        | 9            | 10           | 11           | 12           | 13           |  x
     | ½(ν-1) = 3.5 | 4            | 4.5          | 5            | 5.5          | 6            |
 .01 | .000 001 83  | .000 000 34  | .000 000 06  | .000 000 01  |              |              | .01
 .02 | .000 020 26  | .000 005 34  | .000 001 41  | .000 000 38  | .000 000 10  | .000 000 03  | .02
 .03 | .000 082 12  | .000 026 36  | .000 008 52  | .000 002 77  | .000 000 90  | .000 000 29  | .03
 .04 | .000 220 32  | .000 081 28  | .000 030 18  | .000 011 27  | .000 004 23  | .000 001 59  | .04
 .05 | .000 471 53  | .000 193 58  | .000 079 99  | .000 033 22  | .000 013 86  | .000 005 80  | .05
 .06 | .000 874 69  | .000 391 49  | .000 176 34  | .000 079 84  | .000 036 30  | .000 016 56  | .06
 .07 | .001 469 99  | .000 707 24  | .000 342 42  | .000 166 62  | .000 081 42  | .000 039 92  | .07
 .08 | .002 298 07  | .001 176 28  | .000 605 82  | .000 313 58  | .000 162 98  | .000 085 00  | .08
 .09 | .003 399 45  | .001 836 58  | .000 998 30  | .000 545 31  | .000 299 08  | .000 164 59  | .09
 .10 | .004 813 96  | .002 728 00  | .001 555 21  | .000 890 92  | .000 512 41  | .000 295 71  | .10
 .11 | .006 580 38  | .003 891 63  | .002 315 12  | .001 383 83  | .000 830 43  | .000 499 98  | .11
 .12 | .008 736 05+ | .005 369 26  | .003 319 17  | .002 061 48  | .001 285 31  | .000 803 99  | .12
 .13 | .011 316 60  | .007 202 82  | .004 610 61  | .002 964 92  | .001 913 91  | .001 239 42  | .13
 .14 | .014 355 68  | .009 433 86  | .006 234 16  | .004 138 37  | .002 757 43  | .001 843 09  | .14
 .15 | .017 884 79  | .012 103 17  | .008 235 49  | .005 628 66  | .003 861 14  | .002 656 86  | .15
 .16 | .021 933 07  | .015 250 28  | .010 660 61  | .007 484 70  | .005 273 88  | .003 727 40  | .16
 .17 | .026 527 17  | .018 913 11  | .013 555 36  | .009 756 81  | .007 047 55+ | .005 105 78  | .17
 .18 | .031 691 16  | .023 127 64  | .016 964 83  | .012 496 17  | .009 236 46  | .006 847 03  | .18
 .19 | .037 446 37  | .027 927 53  | .020 932 88  | .015 754 09  | .011 896 70  | .009 009 50+ | .19
 .20 | .043 811 41  | .033 344 00  | .025 501 63  | .019 581 44  | .015 085 40  | .011 654 21  | .20
 .21 | .050 802 04  | .039 405 31  | .030 711 00  | .024 027 96  | .018 859 97  | .014 843 99  | .21
 .22 | .058 431 15+ | .046 136 81  | .036 598 30  | .029 141 65+ | .023 277 34  | .018 642 73  | .22
 .23 | .066 708 79  | .053 560 62  | .043 197 81  | .034 968 19  | .028 393 20  | .023 114 40  | .23
 .24 | .075 642 09  | .061 695 47  | .050 540 44  | .041 550 31  | .034 261 18  | .028 322 19  | .24
 .25 | .085 235 33  | .070 556 64  | .058 653 40  | .048 927 31  | .040 932 12  | .034 327 51  | .25
 .26 | .095 489 93  | .080 155 78  | .067 559 93  | .057 134 50- | .048 452 32  | .041 189 04  | .26
 .27 | .106 404 49  | .090 500 89  | .077 279 01  | .066 202 79  | .056 867 88  | .048 961 83  | .27
 .28 | .117 974 83  | .101 596 24  | .087 825 23  | .076 158 25+ | .066 213 99  | .057 696 37  | .28
 .29 | .130 194 04  | .113 442 36  | .099 208 53  | .087 021 75+ | .076 524 39  | .067 437 74  | .29

 .37 | .249 634 99  | .234 081 58  | (illegible)  | .207 215 12  | .195 478 62  | .184 669 86  | .37
 .38 | .266 970 13  | .252 104 62  | (illegible)  | .226 223 66  | .214 825 60  | .204 272 46  | .38
 .39 | .284 752 43  | .270 686 94  | (illegible)  | .246 022 66  | .235 079 98  | .224 899 18  | .39
 .40 | .302 950 64  | .289 792 00  | .277 722 72  | .266 567 68  | .256 194 90  | .246 501 87  | .40
 .41 | .321 531 96  | .309 380 70  | .298 191 47  | .287 809 02  | .278 116 57  | .269 023 54  | .41
 .42 | .340 462 20  | .329 411 61  | .319 200 28  | .309 692 05- | .300 784 62  | .292 398 79  | .42
 .43 | .359 705 85+ | .349 841 12  | .340 697 63  | .332 157 58  | .324 132 59  | .316 554 32  | .43
 .44 | .379 226 29  | .370 623 73  | .362 629 05  | .355 142 26  | .348 088 40  | .341 409 58  | .44
 .45 | .398 985 85  | .391 712 20  | .384 937 50  | .378 579 05+ | .372 574 95+ | .366 877 42  | .45

.000 000 01
“000 000 10 
“000 000 60 
“000 002 44 


‘000 007 58 
“000 019 64 


“000 044 
“000 090 
‘000 171 


46 
84 
14 


“O60 301 88 
“000 504 31 
“O00 804 83 
*OO1 2: 


‘O01 833 03 


“O02 641 24 
*003 708 45+ 
°005 O88 43 
‘006 839 73 
*009 025 04 





“O11 710 56 
‘O14 965 O7 
O18 859 04 
"023 463 59 
"028 849 42 
"035 O85 64 
"042 238 66 
“050 370 99 
“O59 540 16 
‘O69 797 56 
‘O81 187 50+ 
093 746 17 
*L07 500 857 
*122 469 15+ 
*138 658 39 
*156 065 13 
°174 674 82 
"194 461 65- 
°215 388 47 
°237 406 98 
*260 457 95+ 
*284 471 69 
*309 368 62 
*335 059 97 
“361 448 66 
“388 430 23 
.415 893 90
.443 723 76
.471 799 96
.000 000 03
“006 000 23 
“000 001 03 


-000 003 48 
“000 009 68 
“000 023 32 
“000 050 26 
“000 099 29 


‘000 182 71 
“000 317 09 
*000 523 85+ 
“000 829 80 
*OO1 267 55- 


‘001 875 80 
“002 699 49 
‘003 789 71 
‘005 203 53 
‘007 003 56 
‘O09 257 38 
‘012 036 80 


015 416 95- | 


°019 475 2: 
“024 290 14 


“029 940 04 
‘036 501 79 
044 049 36 
“052 652 47 
‘062 375 21 
*073 274 66 
‘O85 399 64 
‘O98 789 54 
*113 473 24 
“129 468 23 
"146 779 78 
*165 
*185 309 557 
*206 473 17 


"228 843 95+ | 


*252 361 44 
"276 952 44 
*302 531 68 
*329 002 56 
*356 258 18 
*384 182 49 
*412 651 52 
"441 534 89 
*470 697 27 
*500 000 00 


*000 000 O01 
“000 000 09 
“000 000 43 


“000 001 60 
“000 004 79 
“000 012 25 
‘000 027 87 
‘000 057 73 
‘000 110 82 
-000 199 79 
‘000 341 67 
“000 558 56 
‘000 878 26 


“001 334 80 
‘001 968 83 
002 827 82 
‘003 966 12 
‘005 444 77 
‘007 331 16 
“009 698 41 
‘012 624 57 
‘016 191 67 
*020 484 48 
*025 589 25 
°031 592 19 





*038 577 95+ 


‘046 627 96 
055 818 76 
"066 220 38 
‘O77 894 69 
‘090 893 88 
*105 259 06 
"121 018 94 
°138 188 78 
*156 769 47 
‘176 746 88 


“198 O91 45+ 


*220 758 00 
*244 685 83 
*269 799 10 
"296 007 46 
*323 206 91 
*351 280 94 
*380 101 86 
*409 532 32 
°439 427 07 
*469 634 78 
*500 000 00 


“000 000 03 


“000 000 18 
*000 000 74 
*000 002 37 
“000 006 45+ 
-000 015 48 
-000 033 63 


"000 067 34 
“000 126 12 
“000 223 26 
“000 376 66 
‘000 609 61 
*000 951 48 
“001 438 39 
“002 113 61 
“003 027 93 
°004 239 757 


“005 814 91 
‘007 826 37 
°010 353 56 
‘013 481 54 
°0O17 299 84 
“021 901 19 
"O27 
*033 830 47 
“O41 2 
"050 012 54 
“059 915 58 
‘O71 
‘083 719 39 
‘O97 739 68 | 
°113 231 13 
*130 220 07 
*148 716 95+ 
*168 715 38 
"190 191 42 | 
*213 103 18 
-237 39078 | 
*262 976 59 | 
289 765 87 | 
‘317 647 66 
*346 496 07 | 
*376 171 82 
"406 524 00 | 
437 392 14 
“468 608 41 
"500 000 00 | 


“000 000 O01 
“000 000 08 


“000 000 34 
“000 001 18 
“000 003 40 
“000 008 62 
“O00 019 62 


“000 040 98 
“000 O79 74 
“000 146 11 
“000 254 40 
000 423 75 
“000 679 29 
*OO1L 052 45- 
‘001 582 13 
"002 315 06 
‘003 306 16 


“004 618 74 
‘006 324 38 
*008 502 51 
*O11 239 78 
“014 629 02 


“O18 768 05- 
*023 758 09 
"029 702 04 
‘O36 702 44 
*044 859 37 
"054 268 23 
‘065 017 36 
“O77 185 82 
‘090 841 13 
*106 037 17 
*122 812 26 
*141 187 52 
“161 165 45+ 
*182 728 93 
*205 840 51 


*230 442 12 
"256 455 25 
“283 781 46 
"312 303 36 
*341 885 97 
*372 378 46 
"403 616 22 
*435 423 24 
"467 614 74 
*500 000 00 


} 


*000 000 03 


“000 000 16 
“000 000 58 
000 001 80 
*000 004 80 
-000 O11 47 


“000 024 98 
“000 050 49 
-000 095 76 
“000 172 06 
“000 295 03 


*000 485 63 
°000 771 11 
-001 185 88 
‘O01 772 34 
“002 581 46 
"003 673 2 

-005 116 96 
“006 990 8 

“009 381 85+ 
‘O12 384 78 


“016 101 16 
*020 637 80 
"026 104 99 
“032 614 43 
“040 276 94 
“049 199 94 
“059 484 84 
°071 224 36 
“084 499 88 
‘O99 378 86 


"115 912 48 
*134 133 51 
*154 054 43 
*175 666 06 
*198 936 49 
*223 810 52 
*250 209 65+ 
*278 032 49 
°307 155 73 
*337 435 62 
*368 709 89 
*400 800 13 
*433 514 52 
“466 650 88 
“500 000 00 
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2 =°01—°50 v= 20—2. 

| | 

| | v=20 21 22 23 24 25 r 

| . | 4 (v—1)=9°5 10 10°5 | 11 11°5 12 . 

a, ee ee ee | Ce ee! ee S ee 

| 

‘Ol “01 

| -02 | | “02 
03 | 0: 
04 | | “04 
“05 -000 000 01 “000 000 01 “05 
06 ‘000 000 07 -000 000 03 *000 C00 O1 “000 000 O1 “06 
*O7 “000 000 29 -000 000 14 -000 000 O7 “000 000 O04 “000 000 02 -000 000 01 *O7 
.08 “000 000 95+ “000 000 50+ *000 000 27 “000 000 14 *000 000 07 -000 000 04 ‘O08 
“09 “000 002 68 “000 001 50- “000 OOO 84 “000 000 47 “000 000 26 “000 000 15> “O09 
“10 ‘000 006 71 -000 003 93 “000 002 30 “000 001 35+ ‘000 000 80 “000 000 47 “10 
‘11 ‘O00 015 25- “000 009 32 “000 005 70 -000 003 49 “000 002 14 ‘000 001 31 V1 
"12 ‘000 032 O1 “000 020 32 ‘000 012 91 ‘000 008 21 “000 005 2: ‘000 003 33 "12 
13 “000 062 85 ‘000 041 29 *000 027 16 “000 017 88 *000 011 78 *000 007 77 13 
14 “000 116 53 -000 079 O1 *009 053 62 “000 036 43 “000 024 77 “000 O16 86 “14 
“15 ‘000 205 65- “000 143 51 “000 100 25~ | *000 070 09 -000 049 05* *000 034 36 15 
16 “000 347 61 “000 249 10 -000 178 69 “000 128 30 “000 092 20 “000 066 31 16 
‘17. | 000 565 66 ‘000 415 42 ‘000 305 38 “000 224 70 ‘000 165 47 ‘000 121 96 “az 
18 “000 889 94 “000 668 58 -000 502 78 -000 378 44 “000 285 08 “000 214 92 18 
‘19 ‘O01 358 45+ “001 042 34 000 800 56 ‘000 615 42 ‘000 473 48 ‘000 364 55+ 19 
*20 *002 017 96 ‘001 579 12 001 28 90 *000 969 70 *000 760 83 ‘000 597 39 20 
21 ‘002 924 66 ‘002 331 03 ‘001 859 63 001 484 857- “O01 186 54 “000 948 85+ 21 
22 004 144 68 ‘003 360 54 *002 727 27 ‘002 215 21 ‘001 800 70 ‘001 464 80 22 
23 *005 754 21 )04 741 03 ‘003 909 79 ‘003 226 96 “002 665 41 "002 203 14 23 
24 ‘O07 839 40 ‘006 556 89 005 489 06 “004 598 86 *003 855 91 ‘003 235 21 *24 
25 ‘O10 495 75+ “008 903 28 ‘007 558 96 ‘006 422 71 "005 461 25 “004 646 85- 25 
26 °013 827 25+ “O11 885 44 ‘010 224 94 ‘008 803 24 007 584 63 ‘006 539 O01 26 
27 ‘O17 945 O1 ‘015 617 59 ‘013 603 19 “O11 857 54 ‘O10 343 10 ‘009 027 87 27 
28 “022 965 53 ‘020 221 26 017 819 13 ‘O15 713 86 ‘O13 866 69 “O12 244 30 28 
29 ‘029 O08 64 *025 823 31 *023 005 5: “020 509 85 “O18 296 S81 “016 332 51 29 
*30 ‘O36 195 O01 *032 553 36 ‘029 300 01 ‘026 389 94 *023 784 00 ‘021 448 00 *30 
“Sl "044 643 52 “040 540 98 *036 842 03 *033 502 81 ‘030 484 93 *O27 754 67 31 
*32 ‘054 468 26 049 912 51 045 769 58 *041 997 357- “O38 558 78 *035 421 14 "32 
33 065 775 56 ‘O60 787 68 ‘O56 215 44 “052 018 98 “048 163 08 “044 616 457 33 
34 ‘O78 660 87 °O73 276 06 “068 303 28 ‘063 705 29 “059 449 03 *055 505 05 “34 
“35 093 205 72 ‘O87 473 60 “O82 143 66 ‘O77 181 507 *O72 556 54 ‘O68 241 42 “35 
“36 *109 474 85 *103 459 16 ‘097 830 02 “092 556 03 ‘O87 609 19 ‘O82 964 44 *36 
37 127 513 53 *121 291 33 "115 434 93 “109 915 98 *104 709 14 ‘099 791 71 *37 
*38 147 345 27 *141 005 53 *135 006 62 °129 323 12 *123 932 38 *118 814 04 3t 
"39 168 969 90 *162 611 64 *156 565 91 *150 810 19 *145 324 36 *140 090 39 39 
*40 192 362 12 “186 092 02 *180 103 87 °174 377 87 *168 896 34 *163 643 44 “40 
“41 *217 470 63 *211 400 28 +205 580 00 "199 992 54 *194 622 51 *189 456 14 “41 
*42 *244 217 84 "238 460 66 *232 921 30 "227 584 86 *222 438 11 ‘217 469 18 *42 
“43 *272 500 17 ‘267 168 16 *262 022 11 *257 049 35- °252 238 59 247 579 79 "43 
*44 *302 188 97 ‘297 389 37 292 744 90 "288 244 94 *283 880 06 *279 641 85t "44 
“45 *333 132 07 “328 964 09 324 921 83 *320 996 61 *317 180 75 °313 467 357 *45 
“46 365 155 90 *36] 707 62 358 357 27 "355 O98 03 *351 923 82 *348 829 257 “46 
"47 “398 O68 17 *395 413 72 392 831 04 *390 315 05- “387 861 26 "385 465 66 °AT 
“48 *431 660 95+ *429 858 18 *428 102 39 *426 390 22 “424 718 7O *423 O85 19 “48 
19 *465 714 30 +464 802 84 *463 914 59 +463 047 90 "462 201 28 "461 373 41 *49 
50 *500 000 OO “00 000 OO *500 000 OO *500 000 00 *500 000 00 *500 000 OO *bO 











EXPERIMENTAL DISCUSSION OF THE ($\chi^2$, $P$) TEST FOR
GOODNESS OF FIT.


By KARL PEARSON. 


(1) Introductory. In the Philosophical Magazine for 1900* I published, I think
for the first time, a test which has since been much used for statistical purposes
and has come to be spoken of as the ($\chi^2$, $P$) test. The problem I had in mind was
the following one: Samples of $N$ are taken from a population classed in $v$ categories,
the chance of drawing an individual from the $s$th category being $p_s$; how will the
cell contents $n_1, n_2 \ldots n_s \ldots n_v$ of the samples distribute themselves, and what is the
probability, $P$, that samples may be drawn which deviate more from the mean
parental values than a given sample?

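With certain limitations the answer was shown to lie in the distribution of $\chi^2$
by the curve

$$y = y_0\, e^{-\frac{1}{2}\chi^2} \left(\tfrac{1}{2}\chi^2\right)^{\frac{1}{2}(v-3)} \left[d\left(\tfrac{1}{2}\chi^2\right)\right], \quad \text{where} \quad \chi^2 = \sum_{s=1}^{v} \frac{(n_s - N p_s)^2}{N p_s} \qquad \text{(i)}.$$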





A short table was given in my original paper, and a longer one shortly afterwards
computed by W. Palin Elderton†, by means of which it was possible to
compute $P$ from a known $\chi^2$ and $v$ (or $n'$ as it was termed at that date). In dealing
with the formula for $\chi^2$ the $p_s$’s might be those of a real parental population from
which the sample had been actually drawn, or a hypothetical population from which
we question whether it could reasonably be supposed to have been drawn.
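The passage from a known $\chi^2$ and $n'$ to $P$, which the tables mentioned above were built to serve, can be sketched in a few lines. This is a minimal illustration and not Elderton's method of tabulation: it assumes the usual rule that $n'$ cells correspond to $n' - 1$ degrees of freedom when the only constraint is the sample size, and it uses the standard finite series for the tail of the $\chi^2$ curve. The function names `chi_square` and `tail_probability` are hypothetical.

```python
import math


def chi_square(observed, probs):
    """Formula (i): sum over cells of (n_s - N p_s)^2 / (N p_s)."""
    n_total = sum(observed)
    return sum((n - n_total * p) ** 2 / (n_total * p)
               for n, p in zip(observed, probs))


def tail_probability(chi2, n_prime):
    """P that chi-squared exceeds `chi2`, entering with n' cells (n' - 1 degrees of freedom)."""
    df = n_prime - 1
    if df < 1:
        raise ValueError("need at least two categories")
    half = chi2 / 2.0
    if df % 2 == 0:
        # Even degrees of freedom: e^{-x/2} * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd degrees of freedom: normal tail plus a finite correction series
    chi = math.sqrt(chi2)
    total = math.erfc(chi / math.sqrt(2.0))
    term = math.sqrt(2.0 / math.pi) * math.exp(-half) * chi  # first series term
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= chi2 / (2 * r + 1)
    return total
```

Entering with $\chi^2 = 8.2890$ and $n' = 15$, for instance, gives a value close to the $P = .8725$ recorded for the largest raw basic sample in Table III; small differences are to be expected, since the tabled values were presumably interpolated.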


Some time after this (1911)‡ I published a second paper which discussed the
problem, whether two observed samples of the same or different sizes, but with the
same number $v$ of categories, could reasonably be supposed to have been drawn from
the same parent population. The problem is straightforward if the parent population
be known. If it be not known, what values can be used to represent it? With some
diffidence I suggested the values of the combined samples might be used to replace
the parental population values. It was shown that in either of these cases we might
use the ($\chi^2$, $P$) test entering with $v$ cells ($= n'$ of the ($\chi^2$, $P$) Tables). The point to
be emphasised here is: That when we use the known parent population, the two
samples cannot be written as a contingency table, for the marginal total of $v$ cells is
not the sum of the contribution of each sample to a given cell, thus it is not a marginal

* Vol. L. pp. 157-175.

† Biometrika, Vol. i. pp. 155-163. See also: Tables for Statisticians and Biometricians, Part I,
Table XII, 3rd Edn. 1931.

‡ Biometrika, Vol. viii. pp. 250-254.

total in the contingency table sense. For this reason it is in my view a mistake to
look upon every comparison of two samples or two series as a contingency table; it
only becomes a contingency table in form when we, ignorant of the sampled or parent
population, make the doubtful hypothesis that the latter population can be replaced
by the sum of the two samples. This is so often overlooked in the text-book
treatment of the $\chi^2$ test for the co-origin of two samples that it is desirable to
emphasise it.


But the biserial table formed from two samples differs seriously in another
respect from a true contingency table. In the latter case, according to my
envisaging of it, we start with a parent population and, drawing individuals in
succession from it, table them according to two (or it may be more) characters.
Thus in successive samples, say of size $N$, it is not only the contents of the $\kappa \times \lambda$
cells formed by $\kappa$ categories of one character and $\lambda$ of the other which vary from
sample to sample, but also the marginal totals. In such a case there is only one
degree of constraint on the contingency, arising from the size of the sample $N$.


In a paper published in 1916* I dealt with what I termed “partial contingency,”
namely, cases in which linear relations between the contents of certain numbers
of cells existed from sample to sample. I proved in such cases that not only the
$n'$ by which we enter the ($\chi^2$, $P$) table must be reduced by the number of such
linear relations, but the observed $\chi^2$ itself might also according to the nature of
these constraints have to be reduced. To express the matter algebraically let $n_{st}$ be
the sample frequency in the $s$th row and $t$th column cell of the sample, and if $p_{st}$
be the chance of drawing an individual from the $s$, $t$ cell of the parent population,
the expected value in the $s$, $t$ cell of the sample will be $\bar{n}_{st} = N p_{st}$, and if

$$\phi^2 = \sum_{s,t} \frac{(n_{st} - \bar{n}_{st})^2}{N \bar{n}_{st}} = \frac{\chi^2}{N}, \text{ say},$$

we have

$$N(1+\phi^2) = N + \chi^2 = \sum_{s,t} \frac{n_{st}^2}{\bar{n}_{st}} \qquad \text{(ii)}.$$
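The step from the definition of $\phi^2$ to the form (ii) is not written out in the text. A short expansion, assuming only that the observed and the expected cell frequencies each sum to $N$, would run as follows:

```latex
\begin{aligned}
\chi^2 = N\phi^2
  &= \sum_{s,t} \frac{(n_{st}-\bar{n}_{st})^2}{\bar{n}_{st}}
   = \sum_{s,t} \frac{n_{st}^2}{\bar{n}_{st}} - 2\sum_{s,t} n_{st} + \sum_{s,t} \bar{n}_{st} \\
  &= \sum_{s,t} \frac{n_{st}^2}{\bar{n}_{st}} - 2N + N
   = \sum_{s,t} \frac{n_{st}^2}{\bar{n}_{st}} - N ,
\end{aligned}
\qquad \text{so that} \qquad
N(1+\phi^2) = N + \chi^2 = \sum_{s,t} \frac{n_{st}^2}{\bar{n}_{st}} .
```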

Now the value of $\phi^2$ or $\chi^2$ will depend entirely on what value we give to $\bar{n}_{st}$. My
idea was to give it such a value that $\phi^2$ would lead (subject of course to the error
of random sampling) to a measure of the association of the two characters or variates
in the table. In order to achieve this I assumed the parent population to have no
contingency or association between its variates. In this case $\bar{n}_{st} = N p_{s\cdot}\, p_{\cdot t} = \bar{n}_{s\cdot}\, \bar{n}_{\cdot t}/N$,
where $p_{s\cdot}$ and $p_{\cdot t}$ are the respective chances of an individual being drawn out of the
$s$th row, and an individual being drawn out of the $t$th column of the parental
population, and $\bar{n}_{s\cdot}$ and $\bar{n}_{\cdot t}$ are the respective relative frequencies of row and column
for samples of size $N$. It will be seen that $\bar{n}_{s\cdot}$ and $\bar{n}_{\cdot t}$ are not so far the sum of the
$s$th row and $t$th column of any particular observed sample. We have

$$N(1+\phi^2) = N + \chi^2 = N \sum_{s,t} \left(\frac{n_{st}^2}{\bar{n}_{s\cdot}\, \bar{n}_{\cdot t}}\right) \qquad \text{(iii)}.$$


Now if we do not know the parent population, we may follow one of two courses: 


* Biometrika, Vol. xi. pp. 145-158 and pp. 159-190.

(a) Assume $\bar{n}_{s\cdot}$ and $\bar{n}_{\cdot t}$ are constants, values for the unknown parent population,
and determine on the basis of their being constants the mean and variance, etc. of
$\phi^2$ and $\chi^2$ in terms of the unknown algebraic quantities typified by $\bar{n}_{s\cdot}$ and $\bar{n}_{\cdot t}$.
Finally in the formula so reached we are compelled to insert, in want of better
information, the values of the $\bar{n}_{s\cdot}$, $\bar{n}_{\cdot t}$ actually reached in the observed Table.


(b) Assume that $\bar{n}_{s\cdot}$ and $\bar{n}_{\cdot t}$ are for each individual sample replaced by the
sample values; we thus get a different definition of $\phi^2$ and $\chi^2$, and the mean and
variances etc. will not be the same in cases (a) and (b). For in the latter case we
have to allow for the variation of $n_{s\cdot}$ and $n_{\cdot t}$ in the formula

$$N(1+\phi^2) = N + \chi^2 = N \sum_{s,t} \left(\frac{n_{st}^2}{n_{s\cdot}\, n_{\cdot t}}\right) \qquad \text{(iv)}.$$

$\phi^2$ has been discussed from the standpoints of both (a) and (b) in a series of previous
papers in this Journal*. What I want to emphasise is that if we start with (iv) as
a definition of $\phi^2$ and sample a parent population by taking individuals out one at
a time and recording their characters, we obtain samples in which there is no fixing
of either marginal total column. On the other hand, if we draw two independent
samples, say, of boys and girls for eye colours, one from a population of boys, another
from one of girls, we are dealing with a wholly different method of sampling. We
can form a spurious contingency table out of these two rows with $2v$ cells, but
theoretically we are limited to $v$ cells, as I showed in the original treatment of the
problem, and little appears to be gained by saying we have introduced $v$ conditions
of constraint.
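Formula (iv), in which the marginal totals are those of the sample itself, can be evaluated directly from a two-way table. The sketch below is only an illustration of that formula; the function names `phi_squared` and `chi_squared_from_table` are hypothetical, and the table is assumed to be a list of rows of non-negative counts with no empty row or column.

```python
def phi_squared(table):
    """Mean-square contingency phi^2 from formula (iv): 1 + phi^2 = sum n_st^2 / (n_s. n_.t)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    s = 0.0
    for row, r_tot in zip(table, row_totals):
        for n_st, c_tot in zip(row, col_totals):
            if n_st:  # an empty cell contributes nothing to the sum
                s += n_st * n_st / (r_tot * c_tot)
    return s - 1.0


def chi_squared_from_table(table):
    """chi^2 = N * phi^2, with N the total number of individuals in the table."""
    n_total = sum(sum(row) for row in table)
    return n_total * phi_squared(table)
```

A table whose rows are exactly proportional, such as `[[10, 20], [30, 60]]`, gives $\phi^2 = 0$, while `[[10, 20], [20, 10]]` gives $\chi^2 = 20/3$, the same value as is obtained from the expected frequencies of 15 in each cell.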


(2) Goodness of Fit. I now turn to the main topic of this paper, the application
of the $\chi^2$ test to the problem of “goodness of fit.” Here again divergence of opinion
seems to be largely based on difference of aim and definition.


Suppose we take a random sample from a population, the whole of which
we cannot observe or measure. The object of the anthropologist or craniologist is
to ascertain how far, when making comparison with samples from other racial series,
he may replace the not-fully measured parent population by his sample. In other
words, we have two populations A and B, and we have samples a and b from them.
We want to ascertain how far we can suppose A and B to be alike by a consideration
of whether a representing A is alike to b. We are bound by the conditions of
affairs to observe a sample of A; it stands for us as A, but is not really A. How
far does the fact that it is a sample only of A preclude us from ascertaining whether
a sample b of B could with any probability have been obtained from A? This is the
everyday problem of anthropologist, sociologist and most statisticians, but it is also
the problem of “goodness of fit,” and it is indeed the problem by which I originally
reached the ($\chi^2$, $P$) test, only the paragraph dealing with it reads obscurely and has
been largely overlooked. Let us suppose we have a parent population of $v$ categories


* Biometrika, Vol. v. pp. 192-203 (1906, with J. Blakeman); Vol. x. pp. 570-573 (1915); Vol. xi.
pp. 215-230 (1916, with A. W. Young), and this corrected, Vol. xii. p. 260.

with probabilities of $p_1 \ldots p_s \ldots p_v$, that a sample of $N$ has been drawn with
frequencies $n_1 \ldots n_s \ldots n_v$, and that the moment coefficients about any point of this
sample have been found $\mu_1', \mu_2' \ldots \mu_u'$, the variates corresponding to the frequencies
being $x_1 \ldots x_s \ldots x_v$. Then

$$n_1 x_1^u + n_2 x_2^u + \ldots + n_v x_v^u = N \mu_u',$$

but it will not $= N M_u'$, where $M_u'$ is the $u$th moment coefficient of the parent
population. Every fresh sample will give a fresh series of moment coefficients,
which will not equal those of the unobserved or unknown parent population. There
is, thus, in this case no question of the limitation of the “degrees of freedom.” When
does such limitation occur? Only when we know the parent population, and therefore
its moments, and fit various curves to that population by aid of $t$ of its moments.
We then have $t$ linear relations among the cell frequencies, and must reduce our
“degrees of freedom” by that number. But surely this is not what we usually
require? We do not know the parent population. We know a sample of it. To this
sample we fit a curve and our problem is: How far may we use this curve to
represent the unknown parent population? How far will further samples give
corresponding $\chi^2$ and $\chi'^2$ when compared with the true parent population and with
the sample from it?


We here reach two very important points:

(a) The distribution of the fitted curve to any sample gives a far lower
$\chi^2$ when compared with the parent population than the raw sample from which it
was constructed.

(b) The distributions of $\chi^2$ for the parent population and of $\chi'^2$ for the fitted
curve of the sample deduced from any fresh samples are such: (i) that the mean
difference of $\chi^2$ and $\chi'^2$ is small and (ii) that the correlation of $\chi^2$ and $\chi'^2$
is extremely high. I attempted to give a proof of this in my original paper, a
proof which has been considered obscure, but should have more or less indicated
what my problem was. It was not “goodness of fit” of the curve deduced from
the sample to the individual frequencies of that sample, but that of the $\chi'^2$
distribution treated as an approximation to the $\chi^2$ distribution that I had in view. In
other words, I was considering and still want to consider how far we may replace
the unknown parent population by the frequencies of the smooth curve deduced
from a sample. For this purpose I have in the present paper selected a normal
curve to represent the parent population with a standard deviation of 10. Luckily,
for it saved me much labour, Dr Egon S. Pearson had somewhat over 1000 samples
of 15 drawn from such a population by aid of Tippett’s Random Sampling Numbers*,
and I am very grateful to him for allowing me the use of them. He had computed the
proportional frequencies for a central tenth of the standard deviation and for thirty
such tenths on either side of the central group. Such a distribution is not an exact
normal distribution, but it is very close to it; thus its standard deviation was 10.0283
without correction, and 9.9909 with Sheppard’s correction, both close enough to the


* Tracts for Computers, No. XV. Cambridge University Press.

value 10 of the actual curve. As a matter of fact we may consider this distribution
the parent one; there is no special merit in considering it an exact normal curve.
Out of Dr Pearson’s 15,000 odd samplings from the above parent population I took
eight basic samples, none of which covered the same ground; they were independent
samples from the parent population. These samples were of sizes 600, 300, 150,
105, 60, 45, 30 and 15. I term these the basic samples. The actual frequencies
occurring in each sample I term the Raw Basic Samples. Each of these eight
samples was fitted with a normal curve and the frequencies recomputed from this
normal curve. These distributions I term the Smooth or Graduated Basic Samples.
Finally, for a purpose to be explained later, I reduced all the frequencies of the
Smooth or Graduated Basic Samples to a total of thirty. These may be referred to
as the Graduated Basic Samples reduced. The parental population was reduced to
a total of thirty. I then proceeded to take 100 samples of thirty from the data. These
were independent of each other and of the eight basic samples, i.e. all resulted from
completely independent drawings. Of course had time and energy permitted, it
would have been advisable to have had a large number of basic samples of each
size and more than 100 samples to compare with them, but what has been done
involved the computing of 900 $\chi^2$’s, and that means much labour*. The $\chi'^2$’s
obtained from the smooth basic samples were then compared with the $\chi^2$’s obtained
from the same series of 100 samples as against the parent population, and eight
correlation tables were thus obtained. The close relationship between the $\chi'^2$ from
a smoothed basic sample and the $\chi^2$ from the parent population became at once
manifest, and there were very few cases in which one of the hundred samples would
on the measure of its probability have been rejected or retained as a sample of the
parent population when it would not in the same way have been rejected or retained
by any one of the smoothed basic samples. In other words, the curve provided by
the basic sample is a “good fit” to the parent population, and to judge by the present
experience we are reasonably safe in replacing the unknown parent population by
a graduated basic sample. Now the moment coefficients of the basic samples are
not the same as those of the parent population, nor have they the same values for
each basic sample. What has happened is this, that in calculating the distribution
of the $\chi^2$’s of the 100 samples we have replaced the parent population frequencies
by those of the smoothed basic sample, and the result has shown that we shall not
make many or frequent errors of judgment in so doing. I have only been able to
take eight samples of different sizes as an illustration. Had it been possible to take
50 or 100 samples of each size, we should no doubt have seen the advantage of the
large over the small basic sample with respect to its goodness in representing the
parent population. As a matter of fact in two basic samples one of 30 may be better
than one of 300, though with a large number of samples those of 300 would certainly
as a whole be better than those of 30†.


* The bulk of the computing work was done for me by Dr L. T. Woo, but Mr Georg Hansmann 
undertook one series. 

† The basic samples of 15 and 30 are for their size extremely favourable, i.e. give results much more
accordant with those of the parent population than would be anticipated.

In obtaining the values of $\chi^2$ it was needful to limit the number of categories,
and ultimately 15 categories were selected, each of 3/10ths of the standard deviation,
namely:

| Category         | Central Value |
| Below -19.5      |               |
| -19.5 to -16.5   | -18           |
| -16.5 to -13.5   | -15           |
| -13.5 to -10.5   | -12           |
| -10.5 to -7.5    | -9            |
| -7.5 to -4.5     | -6            |
| -4.5 to -1.5     | -3            |
| -1.5 to +1.5     | 0             |
| +1.5 to +4.5     | +3            |
| +4.5 to +7.5     | +6            |
| +7.5 to +10.5    | +9            |
| +10.5 to +13.5   | +12           |
| +13.5 to +16.5   | +15           |
| +16.5 to +19.5   | +18           |
| Above +19.5      |               |


It was desirable to have a considerable number of categories, but 15 categories
was rather a large number for samples of 30, and it might have been better to
increase the number of individuals in the test samples, as the small frequency
in some of the categories would militate against the theoretical justification of
replacing binomials by normal curves. As this would apply equally to all the basic
samples and to the parent population, and we were dealing so to speak with
relative values of $\chi^2$ and $\chi'^2$, I do not think it will affect the validity of our results,
as it will influence $\chi^2$ to much the same extent as $\chi'^2$. I had also in mind another
reason for choosing $n_s$ to be small, which will appear later.
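The design of the experiment described in this section can be imitated on a small scale. The sketch below is an illustration under stated assumptions, not a reproduction of the original computations: it draws from an exact normal curve with a pseudo-random generator in place of the discretised parent population and Tippett's numbers, it fits the normal curve by the raw mean and standard deviation without Sheppard's correction, and all function names are hypothetical. It uses the 15 categories listed above, compares each test sample of 30 both with the parent population ($\chi^2$) and with the graduated basic sample ($\chi'^2$), and returns the correlation of the two.

```python
import math
import random

# Category boundaries -19.5, -16.5, ..., +19.5 (14 boundaries, 15 categories)
EDGES = [-19.5 + 3.0 * k for k in range(14)]


def normal_cdf(x, mean, sd):
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def cell_probs(mean, sd):
    """Chances of the 15 categories under a normal curve."""
    cdf = [0.0] + [normal_cdf(e, mean, sd) for e in EDGES] + [1.0]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]


def cell_counts(values):
    counts = [0] * 15
    for x in values:
        counts[sum(1 for e in EDGES if x >= e)] += 1
    return counts


def chi_square(counts, probs):
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))


def correlation(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def experiment(basic_size, n_tests=100, test_size=30, seed=1):
    rng = random.Random(seed)
    parent = cell_probs(0.0, 10.0)
    # Basic sample, graduated by a normal curve with its own mean and s.d.
    basic = [rng.gauss(0.0, 10.0) for _ in range(basic_size)]
    mean = sum(basic) / basic_size
    sd = math.sqrt(sum((x - mean) ** 2 for x in basic) / basic_size)
    graduated = cell_probs(mean, sd)
    chi2, chi2_dash = [], []
    for _ in range(n_tests):
        counts = cell_counts([rng.gauss(0.0, 10.0) for _ in range(test_size)])
        chi2.append(chi_square(counts, parent))          # against the parent population
        chi2_dash.append(chi_square(counts, graduated))  # against the graduated basic sample
    return chi2, chi2_dash, correlation(chi2, chi2_dash)
```

Calling `experiment(150)` returns the hundred pairs of values and their correlation for one basic sample of 150; repeating the call with other sizes and seeds gives a rough picture of how the agreement depends on the basic sample.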


(3) Experimental Details. Table I gives the actual data for each basic sample
in four columns. The first of these gives the raw basic frequencies (R.B.F.), the
second gives the frequencies (G.B.F.) of the graduation (in this case a normal
curve) replacing the original sample (R.B.F.), the third column gives the corresponding
frequencies of the parent population (P.P.F.), and the fourth column the
reduced basal frequencies (R.G.F.) for a sample of 30.


In calculating the $\chi^2$ between the parental population and a basic sample, raw
or smoothed by its curve, the full number of the sample was used, the relative
frequencies of the parent population being modified to give the total of the sample.
In treating the basic sample in its turn as a parent population for the 100 samples
of 30, the reduced graduated values of the basic sample were of course employed. In
the computing of $\chi^2$ and the resulting $P$, where given, there is no question of any
constraint beyond the size of the sample. Each basic sample has its own moment
coefficients, and they are not the same as those of the parent population, nor of
another basic sample of the same size.


Table II gives the mean and standard deviation of the raw data from which the
smoothed frequencies were computed.

TABLE II.

Means and Standard Deviations of Basic Samples.

|                        | Mean   | Standard Deviation |
| Parent Population      | 0      | 10*                |
| Basic Sample, Size 600 | +.4250 | 10.2275            |
| Basic Sample, Size 300 | +.6300 | 9.5291             |
| Basic Sample, Size 150 | +.9000 | 10.1380            |
| Basic Sample, Size 105 | +.4286 | 9.4334             |
| Basic Sample, Size 60  | +.8000 | 10.3252            |
| Basic Sample, Size 45  | -.5333 | 11.0041            |
| Basic Sample, Size 30  | -.5000 | 9.5795             |
| Basic Sample, Size 15  | +.6000 | 9.6867             |





It will be noted that only two of the means are negative; the odds against so
small a number of negative signs are only about 5 to 1; yet should there be a
series of rather improbable cases arising from Tippett’s Random Sampling Numbers,
we must remember that that series itself is a random sample, and may be a rather
unusual one†. Table III compares the $\chi^2$’s and $P$’s as found from the Raw Basic
and Graduated Basic Samples as against the Parent Population. Two points at once

TABLE III.

Goodness of Fit of Basic Samples to Parent Population.

| Basic Sample | Raw Basic Sample          | Curve from Raw Basic Sample |
| Size         | $\chi^2$ | $P$ ($n'=15$)  | $\chi^2$    | $P$ ($n'=15$) |
| 600          | 8.2890   | .8725          | [illegible] | .999,943      |
| 300          | 14.6609  | .4024          | [illegible] | .999,682      |
| 150          | 9.8821   | .7703          | [illegible] | .999,975      |
| 105          | 10.1715  | .7491          | .7782       | > .999,999    |
| 60           | 19.6397  | .1427          | .[?]115     | > .999,999    |
| 45           | 9.8996   | .7691          | 1.0000      | .999,999      |
| 30           | 12.3442  | .5788          | .1868       | > .999,999    |
| 15           | 10.8809  | .6951          | .1462       | > .999,999    |








* This was the standard deviation of the parental population curve from which the relative
frequencies of this population were calculated. Working back from these computed frequencies to their
standard deviation, we find 10.0283 for its value, = 9.9909 on applying Sheppard’s correction.
Corresponding to this the standard deviations recorded are all corrected values, and the relative
frequencies of the basic samples were computed from the means and these corrected standard deviations.

† This caution is not given wholly unadvisedly. I have not myself made much use of Tippett’s
numbers, but recently I obtained in 100 trials three such unusual samples that only one should have
occurred in 1,000,000 trials.

result from this table. First all the raw basic samples are, as of course they really must
be, probable samples individually and as a group from the parent population.
Secondly the curves fitted from the raw basic samples to the parent population (note,
not to the raw basic frequencies themselves) are most excellent fits. They can be
said to represent with a high degree of accuracy the parent population. The
experience represented must, I think, be of interest and of real value to the
anthropologist, who can rarely if ever measure whole populations, but has always before
him the problem of whether a certain sample can be considered as belonging to a
population he only knows from the graduated frequencies of another sample. We
are not concerned here with the goodness of fit of a graduated curve to its raw
sample, but of the goodness of fit of a graduated curve based on a raw sample to a
graduated parent population from which the raw sample has been drawn. What
we are considering in this case is the goodness of fit of a graduated sample to a
graduated normal population; there is no limitation of the degrees of freedom, for
the moments by which the graduation is determined change from sample to
sample*.


(4) Goodness of Fit of Graduated Basic Samples to Raw Basic Samples. Here
there is a point to which attention is not always given, or, perhaps, not sufficient
attention. Many years ago† I showed that if two samples $n_s$, $N$ and $n_s'$, $N'$ ($s$
corresponding to the $s$th category out of $v$ categories) were taken from a parent
population in which $p_s$ was the chance of drawing an individual at random from the
$s$th category, then if

$$\chi^2 = \sum_{s=1}^{v} \frac{N N' \left(\dfrac{n_s}{N} - \dfrac{n_s'}{N'}\right)^2}{(N + N')\, p_s} \qquad \text{(v)},$$

we have $\chi^2$ distributed according to the curve

$$y = y_0\, e^{-\frac{1}{2}\chi^2} \left(\tfrac{1}{2}\chi^2\right)^{\frac{1}{2}(v-3)} \left[d\left(\tfrac{1}{2}\chi^2\right)\right] \qquad \text{(vi)}.$$


But an essential condition of this result is that the series $p_s$ is to be considered constant
throughout the series of pairs of samples. It is only under these conditions that the
constants of $\chi^2$, for example its mean and standard deviation, can be supposed given
by the above distribution‡. In applying the $\chi^2$ test to two samples, it is always
well to consider what we are assuming our parent population to be. We may of
course put for $p_s$ any series of values we please, and can find the probability that
the two samples belong to the corresponding population. If we have no knowledge
of the parent population, we can use as the best substitute available for $p_s$ the sum
of the two samples, but our result is bound to be unreliable if those samples are
not considerable.

* I may note here that I have often been asked: What is the value of so much curve fitting to
samples? The answer is more or less conveyed in the present paper where we can see that the graduated
basic sample effectively represents a parent population, even in the case of relatively small samples, and
so serves as a standard for measuring the degree of divergence of one population from a second. This
should be done not by comparing raw, but by comparing graduated basic samples.

† Biometrika, Vol. viii. pp. 250-254, 1911.

‡ There is a still further limitation: $n_s$ and $n_s'$ must not be correlated.

On this latter assumption the value taken for $\chi^2$ will be

$$\chi^2 = \sum_{s=1}^{v} \frac{N N' \left(\dfrac{n_s}{N} - \dfrac{n_s'}{N'}\right)^2}{n_s + n_s'}.$$

But if we do this we must remember that if we take another pair of samples
indicated by $\tilde{n}_s$, $\tilde{n}_s'$, the corresponding $\chi^2$ is not

$$\sum_{s=1}^{v} \frac{N N' \left(\dfrac{\tilde{n}_s}{N} - \dfrac{\tilde{n}_s'}{N'}\right)^2}{\tilde{n}_s + \tilde{n}_s'},$$

but is equal to

$$\sum_{s=1}^{v} \frac{N N' \left(\dfrac{\tilde{n}_s}{N} - \dfrac{\tilde{n}_s'}{N'}\right)^2}{n_s + n_s'},$$

otherwise the distribution of $\chi^2$ is not given by

$$y = y_0\, e^{-\frac{1}{2}\chi^2} \left(\tfrac{1}{2}\chi^2\right)^{\frac{1}{2}(v-3)} \left[d\left(\tfrac{1}{2}\chi^2\right)\right].$$
Hence obscurity seems to me to arise when we write

| $n_1$       | $n_2$       | ... | $n_s$       | ... | $n_v$       | $N$    |
| $n_1'$      | $n_2'$      | ... | $n_s'$      | ... | $n_v'$      | $N'$   |
| $(N+N')p_1$ | $(N+N')p_2$ | ... | $(N+N')p_s$ | ... | $(N+N')p_v$ | $N+N'$ |

and replacing the horizontal totals by $n_s + n_s'$ speak of the result as a “contingency
table” with $v$ constraints. It is true that if we write

| $n_1$        | $n_2$        | ... | $n_s$        | ... | $n_v$        | $N$    |
| $n_1'$       | $n_2'$       | ... | $n_s'$       | ... | $n_v'$       | $N'$   |
| $n_1 + n_1'$ | $n_2 + n_2'$ | ... | $n_s + n_s'$ | ... | $n_v + n_v'$ | $N+N'$ |

this single pair of samples forms a contingency table* with $2v$ cells where the
previous method has only $v$ cells and we speak of $2v$ cells with $v$ degrees of constraint.
But the next or any other pair of samples will give

| $\tilde{n}_1$  | $\tilde{n}_2$  | ... | $\tilde{n}_s$  | ... | $\tilde{n}_v$  | $N$    |
| $\tilde{n}_1'$ | $\tilde{n}_2'$ | ... | $\tilde{n}_s'$ | ... | $\tilde{n}_v'$ | $N'$   |
| $n_1 + n_1'$   | $n_2 + n_2'$   | ... | $n_s + n_s'$   | ... | $n_v + n_v'$   | $N+N'$ |

and although the horizontal totals are still fixed, this is far from being a contingency
table unless $\tilde{n}_s + \tilde{n}_s' = n_s + n_s'$ for every pair of samples.


The relation $\chi^2 = (N + N')\,\phi^2$ holds for the first pair of samples, and this only
when we replace $(N + N')\,p_s$ by $n_s + n_s'$. Such a relation as that written down leads
the student to believe that $\chi^2$ is always proportional to $\phi^2$; for every other pair of
samples beyond the first pair this is not true, and accordingly it seems to me a
misfortune to discuss the matter under the heading of a contingency table; it
confuses in the mind of a student the difference between a $\chi^2$ test, with its
pseudo-contingency table based on a narrow hypothesis, and the true sample contingency
table, where the marginal totals vary, and the only limitation is the single one, i.e.
the size of the whole table. For these reasons I much prefer not to look upon the
two sample test as a case of a contingency table, but as a comparison of the
difference of the relative frequency of two samples with a certain parent population.
Naturally this leads the student to define clearly what his parent population is
supposed to be.

* The equality of $(N + N')\,\phi^2$ and the $\chi^2$ above is of course easily demonstrable.
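The distinction drawn in this section can be checked numerically. The sketch below is an illustration with hypothetical function names: `two_sample_chi2` evaluates formula (v) when a series of parent chances $p_s$ is supplied and falls back on the pooled substitute $n_s + n_s'$ otherwise, while `contingency_chi2` computes $(N + N')\,\phi^2$ from the same pair of samples written as a table of two rows.

```python
def two_sample_chi2(sample_a, sample_b, parent_probs=None):
    """Formula (v) if parent_probs is given; otherwise (N + N') p_s is replaced by n_s + n_s'."""
    n_a, n_b = sum(sample_a), sum(sample_b)
    total = 0.0
    for s, (a, b) in enumerate(zip(sample_a, sample_b)):
        if parent_probs is not None:
            denom = (n_a + n_b) * parent_probs[s]
        else:
            denom = a + b
        if denom == 0:
            continue  # category empty in both samples
        total += n_a * n_b * (a / n_a - b / n_b) ** 2 / denom
    return total


def contingency_chi2(table):
    """(N + N') * phi^2 for the samples written as a two-row table, using formula (iv)."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    s = sum(x * x / (r * c)
            for row, r in zip(table, rows)
            for x, c in zip(row, cols) if x)
    return n * (s - 1.0)
```

For the made-up samples `a = [12, 30, 18]` and `b = [10, 14, 16]`, the pooled form agrees with `contingency_chi2([a, b])`, as the footnote states, whereas supplying fixed chances such as `[0.25, 0.45, 0.30]` gives a different value (2.24 in this instance). Only the pooled case coincides with the contingency computation, which is the point at issue.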

Now in Table III we have given two illustrations of “Goodness of Fit.” First
we have the raw basic samples and we compare them with the parent population
for 15 categories. There is no doubt in this case that there is no limitation in the
way of constraints beyond the size of the raw basic samples, and we look up Elderton’s
Table with $n' = 15$. Secondly we have tested the graduated against the parent
population and we have used two moments, not of the parent population but of the
raw sample; clearly except for the fractions such a sample could arise directly from
sampling the parent population. Such a sample would be rare, and its goodness of
fit is made obvious by the smallness of its $\chi^2$. But we have no constraint; the
moments of the next sample will differ from those of this one. We have by fitting
by moments only selected one of the possible samples of the parent population,
and we find that there are few samples better than it with regard to the fit to the
parent population. To get (in particular from a small sample) the best possible
approach to the parent population may be a difficult problem, but whether we
graduate by two or four moments we are not restricting the number of degrees of
freedom, we are simply selecting a possible sample out of endless possible samples.
Where then does restriction of the degrees of freedom arise? Only as far as I can see
when we fit by the moments of a sample a series of curves to the sample using the
same moments in each case and the same number of categories; then the curve
with the lowest $\chi^2$ will have the best fit. But the $P$ of the $\chi^2$ table must be looked
up under $n'$ less the number of moments used*. This is however not the case I
personally have had in view when considering “goodness of fit”; I want to ascertain
how close the graduated sample is to the parent population, not to its raw sample.
How far in the case of unknown parent populations can a measure of further
samples from the parent population given by $\chi^2$’s be replaced by $\chi'^2$’s, the measure
of departure of these samples from the graduated first sample? This problem will
be answered in the next section; the object of the present section is to consider the
graduated basic sample in relation to the raw basic sample. In our case the
graduated sample has been fitted by two moments: are we to reduce $n'$ by two
constraints? Clearly we are not fitting a series of curves to the one sample, we
have selected a normal curve and we are not asking whether it is a better fit than
a parabola or a sine curve. We must first determine which curve in the present
comparison is to stand as the parent population. Obviously it cannot be the raw
basic sample, for in that case it may have zero frequency in certain cells and

* For example, we might fit as graduation to a sample either the curve y=y,e~?* T,, (w/b) or the curve 
of Type IV y=y,e7¥*/(a2+ 22)"; then if the categories were n’ in number we should have to reduce n’ 
by four in ascertaining their P’s. 
accordingly the frequency of the normal curve could never have been obtained from
it: that is, the normal curve would be an impossible sample. We must treat the 
graduated basic sample as our parent population, and ask what is the probability 
that the raw basic sample could be drawn from a parent population with the 
relative frequencies of the graduated sample. How we have obtained that parent 
population, whether by guess-work or by moments in any number, does not come 
into the problem. Here is a parent population, and here again is a sample which 
could be drawn from it: what is the probability P of samples like the present or 
more remote? It seems to me that this is a reasonable problem, and that it is the 
problem we usually desire to answer in curve fitting, rather than the question of 
the comparative fit of two curves determined by the same number of moments. If 
so, the process by which we have reached our parent population is a matter of 
indifference, we have no restriction of our degrees of freedom, beyond the size of 
the sample. I get my graduated basic sample—not to test it against other processes 
of graduation—but to see how far it may replace the unknown parent population 
from which the raw basic sample was drawn, and I do this by testing 100 experimental 
samples from a known parent population against that population and against the 
graduated basic sample to ascertain what is the relation of their χ²’s.


Table IV gives the Goodness of Fit of the Raw to the Graduated Basic Samples 
in the cases of the eight basic samples. It will be seen at once that the fit is good, 
and it should be, because our samples are owing to the equality of moments good 
ones—but the fit is nothing like as good as the fit of the graduated basic samples 


TABLE IV.

Fit of Raw Basic to Graduated Basic Samples.

Size of      Raw Basic to Graduated         Raw Basic Samples to
Basic            Basic Samples               Parent Population
Sample       χ²        P       Order       χ²        P       Order

 600        7.2899    .9216    1st        8.2890    .8725    1st
 300       13.0665    .5214    7th       14.6609    .4024    7th
 150        8.8009    .8427    3rd        9.8821    .7703    2nd
 105       10.5628    .7193    4th       10.1715    .7491    4th
  60       18.8529    .1711    8th       19.6397    .1427    8th
  45        7.3810    .9174    2nd        9.8996    .7691    3rd
  30       11.9425    .6109    5th       12.3442    .5788    6th
  15       12.1290    .5960    6th       10.8809    .6951    5th








to the true parent population. This table shows results of considerable importance. 
The two orders are very nearly the same, the only interchanges being that of the 
2nd and 3rd into the 3rd and 2nd, and of the 5th and 6th into the 6th and 5th. 
As we see the Basic Sample of 300 was a bad one and that of 45 an especially 
good one. No doubt had we been able to take a large number of basic samples 
of 300 and of 45, these results would have been averaged out, and samples of 300
and of 45 put in more appropriate order. Table III shows us that for practical
purposes there is very little to choose between the fits of all eight graduated basic
samples to the parent population; we cannot place them in order without recalculating
the (χ², P) table to more figures; all however give us a fit measured by
P > .9996. The same four stand at the top in both orders and the same four at the
bottom. As a rough rule we may therefore say that the raw sample which fits best 
its own graduation fits best the parent population. In other words, if we are 
seeking the “best” out of a number of samples from an unknown population, that 
best will be roughly indicated by the degree of goodness of fit it bears to its own 
graduation. Most investigators would say: “Oh, but a sample of 300 must give a 
better representation of an unknown parent population than one of 45!” It is of 
course true in the long run that we shall get better results from samples of 300 
than from samples of 45. But in the present instance we have a case in which an 
individual sample of 45, both in its raw (Table IV) and graduated form (Table III),
is a better fit to the parent population than a sample of 300. Of course it is 
needful for both samples to be true random samples from the parent population, 
not in any way selected for graduation and they must be graduated by the same 
process. 
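
The P values entered against each χ² in Table IV come from Elderton's (χ², P) table with n′ = 15. As a minimal sketch, assuming that table tabulates the upper tail of the χ² distribution with n′ − 1 degrees of freedom, the same probabilities can be computed from the standard finite series. The function name `chi2_tail` is hypothetical, and only two rows of Table IV are reproduced.

```python
import math


def chi2_tail(chi2, n_prime):
    """Probability that chi-square with n_prime - 1 degrees of freedom
    exceeds chi2, from the closed-form finite series (no table look-up)."""
    df = n_prime - 1
    if df < 1 or chi2 < 0:
        raise ValueError("need n_prime >= 2 and chi2 >= 0")
    half = chi2 / 2.0
    if df % 2 == 0:
        # Even degrees of freedom: exp(-x/2) * sum_{r < df/2} (x/2)^r / r!
        term = total = 1.0
        for r in range(1, df // 2):
            term *= half / r
            total += term
        return math.exp(-half) * total
    # Odd degrees of freedom: normal tail plus a finite series in odd powers.
    root = math.sqrt(chi2)
    term, total = root, 0.0
    for r in range(1, (df + 1) // 2):
        if r > 1:
            term *= chi2 / (2 * r - 1)
        total += term
    return (math.erfc(root / math.sqrt(2.0))
            + math.sqrt(2.0 / math.pi) * math.exp(-half) * total)


# Two rows of Table IV (raw basic sample against its own graduation).
for size, chi2, tabled_p in [(600, 7.2899, 0.9216), (60, 18.8529, 0.1711)]:
    print(size, round(chi2_tail(chi2, 15), 4), tabled_p)
```

The computed values differ from the tabled ones only in the third decimal place, which is what interpolation in a printed table would be expected to give.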


(5) Parent Population and Graduated Basic Samples tested against 100 further 
Samples of 30 drawn from the Parent Population. This is the main part of our 
experimental work, wherein we strive to determine the degree of accuracy with 
which the graduated basic sample can be used as representative of the parent 
population. It is in its turn to be treated as a parent population and the 100 
samples from the original parent population will be tested by their χ² from the
latter population and by their χ′² from the graduated basic sample as a spurious
or step-parental population.

We shall investigate (a) the mean difference χ² − χ′², (b) its standard deviation
σ_{χ²−χ′²}, and (c) the correlation of χ² and χ′², r_{χ²,χ′²}. As the range of χ² and χ′² is very
considerable, and the correlation tables could only be formed and published for
fairly considerable subranges, these were taken as unity for χ² and χ′². To determine
the mean and standard deviation of χ² − χ′², the actual differences were taken
and grouped in subranges of 0.2. Thus we find that mean (χ² − χ′²) is not exactly
equal to mean χ² − mean χ′² as given by the correlation tables, nor does σ_{χ²−χ′²} agree exactly with

\[ \sigma^2_{\chi^2-\chi'^2} = \sigma^2_{\chi^2} + \sigma^2_{\chi'^2} - 2\,\sigma_{\chi^2}\,\sigma_{\chi'^2}\,r_{\chi^2,\chi'^2} \]

as given by the same correlation tables. The accordance however is good. Correlation
Tables A–H tabulate the experience, and Table V gives the chief constants
obtained in the manner described above.

The regressions are approximately linear, and accordingly the constant

\[ \sigma_{\chi'^2}\sqrt{1 - r^2_{\chi^2,\chi'^2}} \]

as well as the regression coefficient of χ′² on χ², or R_{χ′²,χ²} = σ_{χ′²} r_{χ²,χ′²}/σ_{χ²}, have been
added.








Experimental Discussion of the (χ², P) Test for Goodness of Fit

TABLE V.

Constants of the Distributions of χ² for the Parent Populations and of
χ′² for the Basic Samples as Step-Parent Populations.

For the Parent Population: Mean χ² = 12.68, σ_{χ²} = 4.6265.

Size of   From Distributions of χ² − χ′²     From the Correlation Tables A–H grouped for χ² and χ′²
Basic     in 0.2 intervals                   in unit intervals
Sample    Mean χ² − χ′²     σ_{χ²−χ′²}       Mean χ′²   σ_{χ′²}   r_{χ²,χ′²}        σ_{χ′²}√(1 − r²)   R_{χ′²,χ²}

 600      + .018 (ii)       .7889 (i)        12.66      4.6402    .985,727 (i)      .7812              .988,646
 300      − .996 (vi)      1.6212 (v)        13.61      5.4474    .979,674 (ii)    1.0927             1.156,107
 150      − .398 (iv)      1.3046 (iii)      13.07      4.9135    .966,350 (iii)   1.2639             1.026,296
 105      +1.938 (viii)    1.6578 (vi)       13.82      5.5222    .966,108 (iv)    1.4255             1.153,149
  60      − .164 (iii)     1.2521 (ii)       12.90      4.8017    .964,567 (v)     1.2669             1.001,094
  45      + .017 (i)       1.6738 (vii)      12.71      4.2264    .934,983 (viii)  1.4991              .854,126
  30      − .429 (v)       1.5585 (iv)       13.15      5.1229    .939,327 (vii)   1.7573             1.040,112
  15      −1.280 (vii)     1.9747 (viii)     13.99      5.7277    .952,176 (vi)    1.7501             1.178,813


Now examining this table, it will be seen that the average difference between
χ′² and χ² is small, and that the variation in the difference is not great. The sixth
column of the correlation coefficients shows how highly χ² and χ′² are correlated;
the lowest correlation occurs with the sample of 45, but even this is greater than .93.
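
The relation between the standard deviation of the difference χ² − χ′² and the constants of the correlation tables can be checked numerically. The following is a minimal sketch, assuming only the usual identity for the variance of a difference of two correlated variables; the function name is hypothetical and the inputs are the Table V constants for the basic samples of 600 and 105.

```python
import math


def sd_of_difference(sd_a, sd_b, r):
    """Standard deviation of (a - b) for variables with standard
    deviations sd_a, sd_b and correlation coefficient r."""
    return math.sqrt(sd_a ** 2 + sd_b ** 2 - 2.0 * r * sd_a * sd_b)


SD_PARENT = 4.6265  # sigma of chi-square for the parent population

# (size, sigma of chi'-square, r, sigma of difference from 0.2 grouping)
rows = [(600, 4.6402, 0.985727, 0.7889),
        (105, 5.5222, 0.966108, 1.6578)]

for size, sd_step, r, tabled in rows:
    print(size, round(sd_of_difference(SD_PARENT, sd_step, r), 4), tabled)
```

The values from the correlation-table constants come out near .78 and 1.59, against .7889 and 1.6578 from the finer grouping, which illustrates the sense in which the accordance is good but not exact.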


TABLE VI.

Regression Lines of χ′² on χ².

Size of                                     Size of
Sample    Regression Equation               Sample    Regression Equation

 600      χ′² =  0.12 +  .98865 χ² ± 0.53     60      χ′² =  0.21 + 1.00109 χ² ± 0.85
 300      χ′² = −1.05 + 1.15611 χ² ± 0.74     45      χ′² =  1.88 +  .85413 χ² ± 1.01
 150      χ′² =  0.06 + 1.02630 χ² ± 0.85     30      χ′² = −0.04 + 1.04011 χ² ± 1.19
 105      χ′² = −0.80 + 1.15315 χ² ± 0.96     15      χ′² = −0.96 + 1.17881 χ² ± 1.18
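
The regression lines of Table VI follow from the constants of Table V. The sketch below assumes the ordinary linear regression of χ′² on χ², and further assumes that the ± term is a probable error, that is 0.67449 times the residual standard deviation; that reading is an inference from the figures, not a statement in the text. The function name is hypothetical.

```python
import math

PROBABLE_ERROR_FACTOR = 0.67449  # assumption: the +/- term is a probable error


def regression_line(mean_x, sd_x, mean_y, sd_y, r):
    """Return (intercept, slope, probable_error) for the regression of y on x."""
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    residual_sd = sd_y * math.sqrt(1.0 - r * r)
    return intercept, slope, PROBABLE_ERROR_FACTOR * residual_sd


# Parent population (x = chi-square) and basic sample of 600 (y = chi'-square).
intercept, slope, pe = regression_line(12.68, 4.6265, 12.66, 4.6402, 0.985727)
print(round(intercept, 2), round(slope, 5), round(pe, 2))
```

For the sample of 600 this gives an intercept near 0.12, a slope near .98865 and a probable error near 0.53, in agreement with the first line of Table VI.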
In any case we can deduce χ′² from χ², or χ² from χ′², with greater accuracy
than we can find in human beings any character of the right side from a knowledge
of it on the left side; for example, the length of a right thigh bone from a knowledge
of the length of the same bone on the left. I think any one who studies the correlation
coefficients, the correlation tables and the regression lines will agree that as far as
determining whether a sample B comes from an unknown population A, of which
we have only a sample C, say of 60 or upwards, we shall rarely be wrong in our
diagnosis, if we ask whether B could have been drawn from the sample C after
graduation. That is to say, in the value of χ²,

\[ \chi^2 = S_{s=1}^{\nu} \frac{(n_s - \bar{n}_s)^2}{\bar{n}_s}, \]

we replace the unknown n̄_s’s series by ñ_s’s, where the latter are drawn from a
graduated sample of the unknown population.


There is however a point to be noticed here. The distribution of χ² given by
y = y_0 e^{−½χ²}(½χ²)^{½(ν−3)} depends on the ñ_s’s being the means of the n_s’s, and this is not
correct, although a comparison between P.P.F. column under “Sample of 30,” with
the R.G.F. columns in all the samples in Table I will show that the differences are
not large. Now the mean χ² in samples with ν = 15 categories ought to be
ν − 1 = 14, and σ_{χ²} = √(2(ν − 1)) = 5.2915. A consideration of columns 4 and 5 of
Table V shows that the χ′²’s from Graduated Basic Samples only approach these
values very roughly. The mean of their means for χ′² is 13.2389 and their mean
σ_{χ′²} = 5.05025. It might be plausibly argued that this is due to ñ_s not being n̄_s,
but when we come to the sampling from the parent population where n̄_s has
actually been used we find instead of a better correspondence a worse one, namely,
mean χ² = 12.68 in place of 14, and σ_{χ²} = 4.6265 in place of 5.2915*. It seems
impossible therefore to attribute the divergence to ñ_s not being equal to n̄_s. There
are two explanations which may account for the non-fulfilment by mean χ² and σ_{χ²}
of the theoretical values. First we have used 15 categories throughout, and this
seemed a necessity for purposes of comparison; further, the categories were not
unsuitable when we were comparing the larger samples directly with the parent
population or with one another in their raw and graduated forms. But it is not so
satisfactory when we test the parent population or the graduated samples against
the samples of 30, as the theoretical values in some of the categories become small.
Some experimental work, however, seems to indicate that not very great effect is
produced by the small categories. The second point is that it is due to the
approximate nature of the curve y = y_0 e^{−½χ²}(½χ²)^{½(ν−3)}.

While the true mean χ² = ν − 1, the deduction of the variance of χ² as 2(ν − 1)
depends on the above curve being applicable, which actually it is not. There is a
limit to the value of χ² which is, I think, χ_l² = N(N − n̄_l)/n̄_l, where N is the size of
the sample, and n̄_l the least relative cell frequency of the parental population†. Thus
in order to get the customary equation for χ² with an infinite range we require to
make n̄_l as small as possible, but to do so is to disregard the principle that n̄_s must
be relatively large compared to N in order that we may replace the binomial by a
normal curve. We are thus thrust on the horns of a dilemma. If we say that
(9/10 + 1/10)^N is the most skew binomial that can reasonably be represented by a normal
curve, and we take n̄_l = N/10, then χ_l² = 9N, and for a small sample we may doubt
whether it is legitimate to treat this as an infinite range.

* The divergences are of course not impossible, but they point in one way; actually they are for
difference of means 1.32 ± .36 and for difference of standard deviations .665 ± .252 approximately.

† In the case of the 100 samples of 30 compared with the parent population, χ_l² = 1220, and this
range might be treated as approximately infinite for our purposes, but to obtain it we have infringed the
condition as to replacing binomials by normal curves.
The actual value of the variance of χ², when in Equation (i) the n̄_s = Np_s’s are
the true mean of the n_s’s, is given by*

\[ \sigma^2_{\chi^2} = 2(\nu - 1)\left(1 - \frac{1}{N}\right) - \frac{\nu^2}{N} + S\left(\frac{1}{\bar{n}_s}\right) \qquad \text{(vii)}. \]

Hence the usual value

\[ \sigma^2_{\chi^2} = 2(\nu - 1) \]

may be modified in two ways: (a) if the sample be small, and ν be not small, the
negative term ν²/N may be by no means negligible; for example, if ν = 15 and
N = 30, the term ν²/N = 7.5, which cannot be neglected as compared with 27.07;
(b) on the other hand the term S(1/n̄_s) is additive, and if we are dealing with a small
sample and with a fair sized ν, this may be considerable. For example, in the case
of our experimental parent population reduced to a size of 30, S(1/n̄_s) = 10.597, so
that the theoretical variance in that case is 30.17 instead of 28, giving σ_{χ²} = 5.493
instead of 5.292, and differing still more from the observed 4.627, the deviation
being 3.4 times its probable error. Even with a sample of 50 in 10 categories so
chosen that no category contains less than 4, and thus S(1/n̄_s) reduced to a minimum,
the term ν²/N will still be 2, and this is not negligible as compared with 15.68.
The use of σ²_{χ²} = 2(ν − 1) and consequently of (vi) in the case of small samples is
certainly to be deprecated.
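
The arithmetic of the corrected variance can be followed term by term. The sketch below assumes the variance has the three-term form 2(ν − 1)(1 − 1/N) − ν²/N + S(1/n̄_s), which reproduces the figures 27.07, 7.5 and 30.17 quoted in the text; the function names are hypothetical.

```python
import math


def chi2_variance(nu, n, sum_reciprocals):
    """Variance of chi-square for nu cells and sample size n, where
    sum_reciprocals is the sum over cells of 1 / (expected frequency)."""
    return 2.0 * (nu - 1) * (1.0 - 1.0 / n) - nu ** 2 / n + sum_reciprocals


def chi2_variance_from_probs(n, probs):
    """Same variance, starting from the relative cell frequencies p_s."""
    return chi2_variance(len(probs), n, sum(1.0 / (n * p) for p in probs))


# Parent population of the paper reduced to a size of 30, in 15 categories.
variance = chi2_variance(15, 30, 10.597)
print(round(variance, 2), round(math.sqrt(variance), 3))
```

With equal cell frequencies the two correction terms cancel, since S(1/n̄_s) then equals ν²/N, and only the factor (1 − 1/N) remains.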


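I again hazard the suggestion that the better distribution of χ² in such cases
is to be found from the curve

\[ y = y_0\,(\chi^2)^{p_1}\,(\chi_l^2 - \chi^2)^{p_2} \qquad \text{(viii)}, \]

where χ_l² = N(N − n̄_l)/n̄_l,

\[ p_1 + 1 = \frac{\bar{\chi}^2}{\chi_l^2}\left\{\frac{\bar{\chi}^2\,(\chi_l^2 - \bar{\chi}^2)}{\sigma^2_{\chi^2}} - 1\right\}, \]

\[ p_2 + 1 = \left(1 - \frac{\bar{\chi}^2}{\chi_l^2}\right)\left\{\frac{\bar{\chi}^2\,(\chi_l^2 - \bar{\chi}^2)}{\sigma^2_{\chi^2}} - 1\right\}, \]

χ̄² = ν − 1 being the mean of χ², and σ²_{χ²} being given by (vii).

The Table of the Incomplete B-Function will provide the requisite probability P
for a given χ².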
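
A curve of limited range with prescribed mean and variance can be fitted by equating moments. The sketch below assumes the curve is of the Beta (Type I) form (χ²)^{p₁}(χ_l² − χ²)^{p₂} on the range 0 to χ_l², and uses the standard moment-matching relations for that form; whether this is exactly the fitting intended in the text is an assumption. The function name is hypothetical.

```python
def limited_range_exponents(mean, variance, upper_limit):
    """Exponents (p1, p2) of x**p1 * (L - x)**p2 on 0..L having the
    given mean and variance, found by equating the first two moments."""
    scale = mean * (upper_limit - mean) / variance - 1.0
    p1_plus_1 = (mean / upper_limit) * scale
    p2_plus_1 = (1.0 - mean / upper_limit) * scale
    return p1_plus_1 - 1.0, p2_plus_1 - 1.0


# Figures quoted in the text: mean 14, variance 30.17, limiting value 1220.
p1, p2 = limited_range_exponents(14.0, 30.17, 1220.0)
print(round(p1, 3), round(p2, 1))
```

With these figures p₁ comes out a little under 5.5, noticeably below the value ½(ν − 3) = 6 of the customary curve.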

We can see easily how (viii) passes into a Type III curve if χ_l² be large. In that
case we have approximately

\[ p_1 + 1 = \frac{(\bar{\chi}^2)^2}{\sigma^2_{\chi^2}} = \frac{\nu - 1}{2\lambda}, \qquad p_2 + 1 = \frac{\bar{\chi}^2\,\chi_l^2}{\sigma^2_{\chi^2}} = \frac{\chi_l^2}{2\lambda} \qquad \text{(ix)}, \]

where

\[ \lambda = 1 - \frac{1}{N} - \frac{\nu^2}{2N(\nu - 1)} + \frac{1}{2(\nu - 1)}\,S\left(\frac{1}{\bar{n}_s}\right). \]

Hence

\[ y = y_0\left(\tfrac{1}{2}\chi^2\right)^{\frac{\nu - 1}{2\lambda} - 1}\left(1 - \frac{\chi^2}{\chi_l^2}\right)^{\frac{\chi_l^2}{2\lambda}} \qquad \text{(x)}, \]

or since χ_l² is large we reach

\[ y = y_0\left(\tfrac{1}{2}\chi^2\right)^{\frac{\nu - 1}{2\lambda} - 1} e^{-\frac{\chi^2}{2\lambda}} = y_0'\left(\tfrac{1}{2}\,\frac{\chi^2}{\lambda}\right)^{\frac{\nu - 1}{2\lambda} - 1} e^{-\frac{1}{2}\frac{\chi^2}{\lambda}} \qquad \text{(xi)}. \]

It is accordingly χ²/λ which is approximately given by a Type III curve, and
the power is not ½(ν − 3), but (ν − 1)/(2λ) − 1; the probability P will be easily calculated
from the Incomplete Γ-Function Table. In the case of the samples from the
parent population of this paper λ = 1.07737, and since χ_l² = 1220, the transition to
(xi) is reasonably legitimate. But the usual χ² will be in error to about 8%.

* Biometrika, Vol. XI, Equation (xii), for the value from a limited parent population, and ñ_s not
necessarily the mean of n_s, and as above for the case of ñ_s = mean of n_s.
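
The passage from a curve of limited range to a Type III curve rests on a single limiting step, which may be written out as follows. This is a sketch under the assumption that the limited-range factor is (1 − χ²/χ_l²) raised to a power close to χ_l²/(2λ), with λ the ratio of the true variance of χ² to 2(ν − 1); the numerical lines use only figures quoted in the text.

```latex
\left(1 - \frac{\chi^2}{\chi_l^2}\right)^{\chi_l^2/(2\lambda)}
  = \exp\left\{\frac{\chi_l^2}{2\lambda}\,\log\left(1 - \frac{\chi^2}{\chi_l^2}\right)\right\}
  = \exp\left\{-\frac{\chi^2}{2\lambda} - \frac{(\chi^2)^2}{4\lambda\,\chi_l^2} - \dots\right\}
  \longrightarrow e^{-\chi^2/(2\lambda)} \quad (\chi_l^2 \to \infty).

% Numerical check with nu = 15 and lambda = 1.07737:
\frac{\nu - 1}{2\lambda} - 1 = \frac{14}{2.15474} - 1 \approx 5.50,
\qquad \text{against } \tfrac{1}{2}(\nu - 3) = 6 .
```

With χ_l² = 1220 and χ² of the order of 14, the first neglected term in the exponent is of the order of 0.04, which gives some measure of how legitimate the transition is for the data of this paper.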


The justification for (viii) lies in the fact that it gives the true start and range of
the χ² curve as well as its true mean and variance. It will probably account with
considerable accuracy for the binomials not closely following a normal distribution;
and with the Tables of the Incomplete Γ- and B-Functions the P corresponding
to (viii) or (xi) may be obtained as quickly as from Palin Elderton’s Table.

The whole subject is worthy of further experimental investigation, for if my
conjecture as to the approximate accuracy of (viii) and (xi) be verified, the use of
the χ² test could be extended to small samples and small cell frequencies, which
are not suitable in the case of the ordinary (χ², P) process.


The fundamental experiment of the present paper is in part intended to 
illustrate the need for widening the nature of Equation (i). No discriminating 
investigation could be based on the present data without increasing much beyond 
100 the number of samples taken. 


(6) Actual Comparison of the P’s from a Parent Population and from a 
Graduated Sample from that Population. My original intention was to publish side 
by side the P’s determined from the χ²’s and χ′²’s of the Parent Population and the
eight Graduated Basic Samples. But the large amount of labour and of printing 
involved in computing and publishing 900 P’s induced me to confine my attention 
to a single sample, which I have taken a good way down the list to indicate that a 
relatively small basic sample, say 50 to 100, if graduated, will provide a reasonable 
indication of whether further samples do or do not belong to the unknown parent 
population. The 100 values of P for the 105 Basic Sample are given in Table VII. 


The problem turns here on how many samples which belong really to the parent 
population would have been rejected by the graduated basic sample. Suppose first 
we take the 2% standard. No. 82 would be rejected by both P and P′ tests if
TABLE VII.

Comparison of the Probabilities of 100 Samples drawn from a Parent Population
and again supposed to be drawn from a Graduated Basic Sample of 105.

P: probability if drawn from the Parent Population. P′: probability if drawn from the Graduated Basic Sample of 105.

Sample               Sample               Sample               Sample
Index                Index                Index                Index
No.    P     P′      No.    P     P′      No.    P     P′      No.    P     P′

  1   .763  .754      26   .720  .710      51   .815  .804      76   .827  .710
  2   .204  .125      27   .609  .581      52   .679  .637      77   .303  .200
  3   .681  .584      28   .462  .434      53   .704  .718      78   .209  .046
  4   .232  .153      29   .763  .700      54   .990  .971      79   .697  .?84
  5   .202  .281      30   .775  .820      55   .?79  .376      80   .846  .878
  6   .421  .221      31   .819  .718      56   .098  .024      81   .630  .546
  7   .846  .872      32   .441  .453      57    ?    .746      82   .019  .007
  8   .942  .932      33   .054  .066      58   .439  .526      83   .469  .270
  9   .245  .165      34   .310  .283      59   .932  .962      84   .445  .229
 10   .830  .067      35   .460  .360      60   .128  .041      85    ?     ?
 11   .688  .725      36   .567  .409      61   .382  .373      86   .973  .982
 12   .918  .879      37   .320  .307      62   .117  .026      87   .774  .830
 13   .900  .746      38   .194  .240      63   .381  .461      88   .822  .766
 14   .060  .071      39   .651  .615      64   .105  .019      89   .788  .721
 15   .779  .657      40   .192  .087      65   .879  .873      90   .776  .728
 16   .309  .169      41   .848  .780      66   .456  .305      91   .708  .499
 17   .747  .625      42   .279  .161      67   .889  .937      92   .902  .873
 18   .987  .976      43   .999  .999      68   .866  .835      93   .359  .233
 19   .390  .414      44   .285  .404      69   .869  .894      94   .803  .644
 20   .857  .854      45   .297  .338      70   .743  .721      95   .174  .073
 21   .567  .598      46   .575  .492      71   .437  .383      96   .711  .722
 22   .286  .167      47   .106  .060      72   .180  .112      97   .284  .241
 23   .850  .895      48   .586  .678      73   .852  .806      98   .4?7  .310
 24   .829  .791      49   .945  .949      74   .622  .424      99   .773  .778
 25   .666  .652      50   .859  .884      75   .281  .168     100   .763  .75?




















an isolated sample. No. 64 would be retained as a sample of the parent population 
and rejected as a sample from the basic sample population, had it occurred as an 
isolated sample. Actually it or something worse might be expected to occur twice 
in 100 samples. Thus dealing with a 2% limit and an isolated sample we should
have made an error once in a hundred times in rejecting a sample from the parent
population and twice in 100 times if we used the basic graduated sample in place
of the parent population. If we used a 5% level, No. 82 would be the only
sample rejected on account of its P value in the case of the parent population,
while Nos. 56, 60, 64, 78 and 82 would all be rejected on the basic sample test.
The reader might hastily pass to the conclusion that the basic sample does not
effectively represent the parent population. But the conclusion is rather that the
present sampling is too favourable. A 5% level means that there are five cases
in the 100 below it, the parent population shows only one, while the graduated 
basic sample actually records five. 


We may consider the matter from another standpoint. The distribution of the
probability integrals of any continuous curve is a rectangle, every probability
between 0 and 1 being equally likely*. Accordingly the distribution of P and P′
should be linear. Dividing into 10 groups we have the following scheme:

TABLE VIII.

Distribution of P and P′.

Probability                   .00–.10  .10–.20  .20–.30  .30–.40  .40–.50  .50–.60  .60–.70  .70–.80  .80–.90  .90–1.00  Total

Expected                        10       10       10       10       10       10       10       10       10       10       100
Parent Population, P             5        9       11        8       10        5        9       17      16.5      9.5      100
Graduated Basic Sample, P′       9      10.5      9.5       6       11        6       8.5     16.5      16        7       100

In both cases we find a redundancy of rather favourable samples.
For the Parent Population: χ² = 15.65, P = .075.
For the Basic Sample of 105: χ′² = 12.40, P′ = .193.


The odds in the first case are about 12 or 13 to 1, and in the second case about 
4 or 5 to 1. Thus the Graduated Basic Sample gives the more reasonable result. 
Both are possible in the single isolated trial. 
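
The test of the distribution of P′ against a rectangle can be reproduced directly from the ten group frequencies. The following is a minimal sketch, assuming the frequencies of the P′ row of Table VIII as read from the scan (the last cell, 7, is inferred from the row total of 100) and an upper-tail probability with 10 − 1 = 9 degrees of freedom; the function names are hypothetical.

```python
import math


def chi2_against_uniform(counts):
    """Chi-square of observed group counts against equal expected counts."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


def upper_tail(chi2, df):
    """Probability that chi-square with df degrees of freedom exceeds chi2."""
    half = chi2 / 2.0
    if df % 2 == 0:
        term = total = 1.0
        for r in range(1, df // 2):
            term *= half / r
            total += term
        return math.exp(-half) * total
    root = math.sqrt(chi2)
    term, total = root, 0.0
    for r in range(1, (df + 1) // 2):
        if r > 1:
            term *= chi2 / (2 * r - 1)
        total += term
    return (math.erfc(root / math.sqrt(2.0))
            + math.sqrt(2.0 / math.pi) * math.exp(-half) * total)


# P' row of Table VIII (graduated basic sample of 105).
P_PRIME_COUNTS = [9, 10.5, 9.5, 6, 11, 6, 8.5, 16.5, 16, 7]

chi2 = chi2_against_uniform(P_PRIME_COUNTS)
p = upper_tail(chi2, len(P_PRIME_COUNTS) - 1)
print(round(chi2, 2), round(p, 3), round((1.0 - p) / p, 1))
```

The last figure printed is the odds against a result as divergent as, or more divergent than, the one observed, corresponding to the "4 or 5 to 1" of the text.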


The reason for there being less correspondence between the P and P′ for the
series of 100 samples lies in the low standard deviation of the 105 Basic Sample; 
see Table II. It is the lowest of all eight samples. Hence in the case of rare 
individuals being drawn from the Parent Population, they would be still rarer in 
the case of the 105 Basic Sample, and accordingly what is a rare sample from the 
standpoint of the Parent Population will be still rarer in the case of the Basic 
Sample, i.e. when P is small, P’ will be still smaller. 


The reader may ask for some evidence that the Normal Parent Population and 
the Basic Sample would correspond in the same manner when the samples tested 
were drawn not from the former but from an entirely different population. For this 
purpose I took a Rectangular Population, and not to protract matters too severely 
took only ten samples of thirty from it. The values of P for three assumed parent 
populations are given in Table IX. It will be seen at once that the values of P 
for both the Normal Parent Population and the Graduated Basic Sample of 105 
as parent population are on the whole strongly against the samples from the 
Rectangular Population being their offspring; the Graduated Basic Sample is even 
more strenuous than the grand-parental normal curve population, owing to the fact 
of its smaller standard deviation. Naturally the bulk of the contributions to χ² come

* Thus Bayes’ theorem applies accurately to such distributions of P, all chances being equally likely.

† This reduction of the standard deviation will be somewhat modified by the shifting of the mean,
which is, however, nearly the smallest shift of the series, and this shift, if it compensates for the reduced
standard deviation effect at one tail, will emphasise it at the other.

TABLE IX.

Comparison of values of P from Samples of 30 drawn from a Rectangular Population, with their Parent
Population, the Normal Curve Parent Population, and the Graduated Basic Sample of 105.

Chance of occurrence                     Index Number of Sample from a Rectangular Population
if drawn from:                           I       II      III     IV           V            VI          VII     VIII       IX

Rectangular Population as Parent        .6063   .8311   .5265   .0193        .0458        .2562       .6063   .4497      .3134
Normal Curve as Parent Population       .7418   .0054   .0060   <.000,0005   <.000,0005   <.000,004   .0239   .0024      .0011
Normal Curve of Basic Sample
  of 105 as Parent Population           .5607   .0010   .0045   ?            <.000,005    ?           .0072   .000,003   .000,170

from random variation in the extreme categories of the rectangle*. We may I think
conclude that the Parent Population and the Graduated Basic Sample will give the
same sort of judgment with regard to a population differing from either of them.

But having said this, we examine our table further and begin to realise the
weakness of small samples. Two true samples (IV and V) out of ten from the rectangle
would be rejected on the P = .05 basis, and one (IV) on the .02 basis. On the
other hand two samples out of ten would be accepted as genuine samples of the
Normal Parent Population, and one out of ten as a genuine sample of the Basic
Sample of 105 on the .05 basis, and two from both on the .02 basis. Indeed,
Sample I is a better sample from a normal population than from a rectangular
population, its true parent.

Of course such anomalies will occur, but if they can occur in ten samples from
such very different distributions as a normal curve and a rectangle, must we not
be somewhat anxious whether they will not occur, and more frequently occur, when
we compare two small samples and assert identity of origin? The samples may
really have arisen from two wholly different populations, but far more accordant than
a rectangle and a normal curve!

To illustrate this point I will deduce, by aid of the formula†

\[ \chi^2 = S_{s=1}^{\nu}\,\frac{N N'\left(\dfrac{f_s}{N} - \dfrac{f_s'}{N'}\right)^2}{f_s + f_s'}, \]

the probability of our 7 raw basic samples of sizes 15, 30, 45, 60, 105, 150 and
300 actually taken from a normal population being drawn from a rectangular
population.

* The sampling was done by putting into a box 10 of each of the letters A, B, C...M, N, O on tickets.
A ticket was then drawn, its letter recorded, and it was returned to the box. The box had then its
lid closed, it was waved about, rotated and shaken, and then a second ticket drawn, and the process
continually repeated until 300 tickets had been recorded. The two exceptional samples (IV) and (V) arose
from the last letter, O, occurring seven times, and the last but one, N, seven times. Observation showed
that it was not through faulty shaking, as it was not the same ticket repeating itself. Further, it could
hardly be due to “clustering”, as the tickets had been introduced into the box in a manner which avoided
this, and tickets of the same letter did not follow in approximate succession.

† Biometrika, Vol. VIII. p. 252.


Using the ungraduated raw samples, we have: 


TABLE X.

Probability of Raw Samples from a Normal Population
having a Rectangular Parent.

Size of Samples     15      30      45      60      105     150     300

Values of P        .8893   .6783   .8306   .3244   .0326   .0051   <.000,0005





It will be clear from this table that even with parent populations so different as 
those here dealt with, the χ² test is inadequate to discriminate between raw samples
from these populations, if the samples have not sizes of the order 100, and even
then not on the .02 probability basis. Safety may be said to begin between 105
and 150, and if 50 or below be said to be “small” sample sizes, it is not dogmatic
to assert that the χ² test* ought never to be applied to such small samples.


(7) Conclusions. An endeavour has been made in this paper to mark more 
clearly the distinctions the writer had in view in introducing χ² and φ² into
statistical theory and practice.

(i) If χ² be defined as

\[ \chi^2 = S_{s=1}^{\nu}\,\frac{(n_s - \tilde{n}_s)^2}{\tilde{n}_s}, \]

then ñ_s is in a succession of samples a constant and equal to the mean value of n_s.
If this condition be satisfied, χ² is given approximately, but only approximately, by
the curve

\[ y = y_0\,e^{-\frac{1}{2}\chi^2}\left(\tfrac{1}{2}\chi^2\right)^{\frac{1}{2}(\nu - 3)}\left[d\left(\tfrac{1}{2}\chi^2\right)\right], \]

where ν is the number of cells = n′ of Elderton’s (χ², P) table.

The mean of χ² is ν − 1, but its true standard deviation is not √(2(ν − 1)) but is
given by Equation (vii) above. It is suggested that either Equation (viii) or
Equation (xi) will give a better value for the P corresponding to a given χ², using
either the Incomplete B- or Γ-Function Tables, than Elderton’s (χ², P) Table, when
N is not very large or any 1/ñ_s is not negligible as compared to unity.


(ii) It has been pointed out that the main use of y* was intended to be the 
comparison of a considerable graduated sample of a parent population with further 
samples in order to test whether such samples were or were not likely to be 
samples from the parent population, only known through this graduated sample. 
It is shown from a series of experimental examples that the χ′²’s and P′’s from the
graduated sample are very highly correlated with the χ²’s and P’s from the parent
population, so that without frequent wrong judgment we may use the step-parental
population in place of the unknown parent population. This amounts to using for
ñ_s the values found from the graduated sample population. They still remain
constant throughout the comparison with further samples, and the larger the size
of the graduated sample the higher on the average will be the correlation between
the χ²’s from the step-parental and true parental populations.

* I mean the form of the χ² test based on the distribution y = y_0 e^{−½χ²}(½χ²)^{½(ν−3)}.

(iii) Incidentally it is pointed out that even for small samples an immensely 
better P is obtained from a graduated than from a raw sample, and this even when 
the size of the sample is small. 

(iv) More than twenty years ago I gave a test for two samples of sizes N and
N′, categories n_s, n_s′, being drawn from the same parent population with relative
frequency of the sth category p_s. It consisted in calculating

\[ \chi^2 = \frac{N N'}{N + N'}\,S_{s=1}^{\nu}\,\frac{\left(\dfrac{n_s}{N} - \dfrac{n_s'}{N'}\right)^2}{p_s}, \]

there being ν categories in both samples, and then applying the (χ², P) table.
Here p_s is supposed throughout the further sampling to be a constant.

If the parent population be supposed unknown, I suggested that the best value
available was p_s = (n_s + n_s′)/(N + N′); this would not be very accurate for small
samples.

In this case

\[ \chi^2 = S_{s=1}^{\nu}\,\frac{N N'\left(\dfrac{n_s}{N} - \dfrac{n_s'}{N'}\right)^2}{n_s + n_s'}. \]

But it has been frequently overlooked that in adopting this value of χ², we must
in measuring P remember that in further samples the denominator n_s + n_s′ is
supposed to remain constant, while we vary n_s and n_s′ in the numerator, otherwise the
distribution of χ² on which the (χ², P) table is based is incorrect. The whole matter
has in my opinion been rendered unnecessarily obscure by writing the two samples
in the form of a spurious biserial contingency table. This is said to have 2v cells, 
and to lack v +1 “degrees of freedom.” If the first pair of samples gives a contingency 
table the second will not, and from this manner of approach we lose sight of the 
true difference between a real and a spurious contingency table. In the former 
all the constituents of the marginal totals are free, and the only limitation is the 
total size of the sample in the table. 


(v) If we, however, write χ² in the form of

$$\chi^2 = \sum_{s=1}^{v} \left\{ \frac{\left(n_s - \dfrac{N (n_s + n_s')}{N + N'}\right)^2}{\dfrac{N (n_s + n_s')}{N + N'}} + \frac{\left(n_s' - \dfrac{N' (n_s + n_s')}{N + N'}\right)^2}{\dfrac{N' (n_s + n_s')}{N + N'}} \right\},$$

we are not only giving ourselves double work, but are apt to forget that to get the
approximate equation for χ² we must consider $n_s + n_s'$ constant in the numerator,
which it is not if we take another pair of samples. There is nothing whatever to
prevent our choosing

$$\phi^2 = \sum_{s=1}^{v} \frac{N N'}{N + N'} \cdot \frac{\left(\dfrac{n_s}{N} - \dfrac{n_s'}{N'}\right)^2}{n_s + n_s'}$$

as our measure of accordance of the two samples, making $n_s$, $n_s'$ vary in both
numerator and denominator. But if this be done, the distribution of φ² is not that
of χ²/(N + N′), and it cannot be deduced from the (χ², P) table. Even approximate
values of the mean and variance are complicated, and the experimental study of the
distribution of φ² has only been started by Professor Kondo's recent paper.
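The relation between the two ways of writing χ² discussed in (iv) and (v) can be checked numerically. The following is only a minimal sketch: the function names and the category frequencies are hypothetical, chosen for illustration, and the code assumes the pooled estimate $p_s = (n_s + n_s')/(N + N')$ described above.

```python
def chi2_two_samples(n, n_prime):
    """Two-sample chi-square with p_s estimated from the pooled samples."""
    N, N_prime = sum(n), sum(n_prime)
    return sum(
        N * N_prime * (a / N - b / N_prime) ** 2 / (a + b)
        for a, b in zip(n, n_prime)
        if a + b > 0
    )


def chi2_contingency_form(n, n_prime):
    """The same quantity written as a 2 x v contingency-table chi-square."""
    N, N_prime = sum(n), sum(n_prime)
    total = N + N_prime
    chi2 = 0.0
    for a, b in zip(n, n_prime):
        m = a + b
        if m == 0:
            continue
        expected_a = N * m / total        # expected frequency in first sample
        expected_b = N_prime * m / total  # expected frequency in second sample
        chi2 += (a - expected_a) ** 2 / expected_a
        chi2 += (b - expected_b) ** 2 / expected_b
    return chi2


def phi2(n, n_prime):
    """phi^2 = chi^2 / (N + N')."""
    return chi2_two_samples(n, n_prime) / (sum(n) + sum(n_prime))


# Hypothetical category frequencies, for illustration only.
first = [12, 30, 41, 17]
second = [20, 55, 60, 15]
print(chi2_two_samples(first, second), chi2_contingency_form(first, second))
```

The two functions agree to rounding error for any pair of samples, which is the sense in which the contingency form gives "double work": it contains two terms per category where one suffices.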


(vi) If we take a sample, graduate it by aid of t moments and then compare it
with any population, it is, apart from fractions of a unit, a possible sample from
that population, and we are at liberty to look out P in the ordinary (χ², P) table
and judge where it stands among other possible samples of the same size and the
same categories. We have not limited anything by obtaining our graduated
sample by moments. When we do limit by moments is when we fit a series of
curves to a given distribution by moments, the curve moments being in each case
those of the given distribution. In such a case we compare the relative goodness
of fit of various curves obtained by t moments and our degrees of freedom are reduced
by t. An application of such limitation of degrees of freedom was made by me
in 1915 and applied to the case of death-rates. It was shown in the memoirs
concerned with this topic that in certain cases not only the degrees of freedom, but
the value of χ² with which the table was entered might need modification*. Another
case of the modification of both v and χ² to get a better measure of P is indicated
in this paper: see pp. 366–367 above.


(vii) We have also seen in this paper that it is still probably legitimate to 
calculate P from y* when, owing to the smallness of the sample and of its cate- 
gories, the distribution y= yo e~}" (47)! is no longer accurate. In such cases 
we are thrown back on frequency curves which are generalisations of the usual
χ²-curve, and which can be integrated by aid of the Incomplete Γ-Function and the
Incomplete B-Function Tables. There is here a field for much experimental work of
a useful kind.

* Biometrika, Vol. x1, pp. 145—184. 


[Table: χ² from Basic Sample of 150 and χ² from Parent Population.]
ON THE DISTRIBUTION OF THE CORRELATION
COEFFICIENT IN SMALL SAMPLES*.


By PAUL R. RIDER, Washington University, Saint Louis. 


It was the original purpose of this study to attempt to discover the effect upon
the distribution in random samples, particularly in small samples, of the
product-moment coefficient of correlation, r, when the samples are drawn from a non-normal
instead of a normal population. In Part I the results of sampling from certain
populations which differ greatly from the normal are given, also the results of sampling
from a normal population having a high degree of correlation. As the sampling was
done experimentally, it was necessary to deal with discrete populations. This opened
up the question of the effect of grouping upon the distribution of r, a question which
is investigated in Part II.


I. SAMPLING FROM NON-NORMAL POPULATIONS AND FROM A NORMAL
POPULATION HAVING HIGH CORRELATION.


Description of the Populations sampled. 


The populations sampled will be termed rectangular, triangular, and normal, and
will be designated by R, T, and N, respectively.


Rectangular Population. The rectangular population may best be characterized
by its correlation table, which is composed of 10 × 10 compartments, each with the
same frequency. That is, there is equal probability, in random sampling, of obtaining
any pair of values within a limited (rectangular) region. Obviously the correlation, ρ,
in such a population is zero. If x and y are the variates, then for the marginal
distributions we have β₁(x) = β₁(y) = 0, β₂(x) = β₂(y) = $1.77\dot{5}\dot{7}$. (The dots indicate
a repeating decimal.) For the corresponding continuous bivariate distribution,
ρ = 0, β₁(x) = β₁(y) = 0, β₂(x) = β₂(y) = 1.8. The frequency surface is a rectangular
parallelopiped, or, with suitably chosen units, a cube.


Triangular Population. The correlation table of the triangular population is also
composed of 10 × 10 cells, but the frequencies in all of the cells on one side of
a principal diagonal are zeros, the frequencies in the remaining cells having a constant
value different from zero. It is found that

ρ = ½, β₁(x) = β₁(y) = $0.32\dot{6}$, β₂(x) = β₂(y) = $2.3\dot{6}$.


* This investigation was made possible by assistance from a grant made by the Rockefeller 
Foundation to Washington University for Research in Science. The writer wishes to express his sincere 
thanks for this grant, and also to make grateful acknowledgment of valuable criticisms, suggestions, 
and assistance given by Dr Egon S. Pearson. 
For the corresponding continuous distribution the frequency surface is a right prism,
with a right triangle for a base. Its constants are

ρ = ½, β₁(x) = β₁(y) = 0.32, β₂(x) = β₂(y) = 2.4.
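The constants quoted for the two discrete populations can be reproduced directly from the correlation tables. The sketch below is illustrative only: the function name `constants` is hypothetical, the class marks are taken as 0 to 9, and for the triangular table it is assumed that the cells on the principal diagonal belong to the non-zero side.

```python
from math import sqrt


def constants(cells):
    """Return (rho, beta1 of x, beta2 of x) for a dict {(x, y): frequency}."""
    total = sum(cells.values())
    mx = sum(f * x for (x, y), f in cells.items()) / total
    my = sum(f * y for (x, y), f in cells.items()) / total

    def mu(p, q):
        # Central product moment of order (p, q).
        return sum(
            f * (x - mx) ** p * (y - my) ** q for (x, y), f in cells.items()
        ) / total

    rho = mu(1, 1) / sqrt(mu(2, 0) * mu(0, 2))
    beta1 = mu(3, 0) ** 2 / mu(2, 0) ** 3
    beta2 = mu(4, 0) / mu(2, 0) ** 2
    return rho, beta1, beta2


# 10 x 10 table with equal frequency in every cell.
rectangular = {(x, y): 1 for x in range(10) for y in range(10)}
# 10 x 10 table, zero on one side of the diagonal (diagonal assumed non-zero).
triangular = {(x, y): 1 for x in range(10) for y in range(10) if y <= x}

rho_r, beta1_r, beta2_r = constants(rectangular)
rho_t, beta1_t, beta2_t = constants(triangular)
print(rho_r, beta1_r, beta2_r)
print(rho_t, beta1_t, beta2_t)
```

Under these assumptions the rectangular table gives β₂ = 1.7757..., and the triangular table gives ρ = 0.5, β₁ = 0.3266... and β₂ = 2.3666..., in agreement with the values stated in the text.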


Normal Population. The normal population sampled is the one shown in Table I.
Its frequencies were calculated from tables of volumes of the normal surface*. It
represents a population in which the coefficient of correlation is 0.9. On account of
the grouping, the actual coefficient as computed from the table is 0.83, but when
Sheppard's correction is applied to the two standard deviations involved in the
denominator of the coefficient of correlation, its value is 0.9009. The values of β₁
and β₂ for the marginal distributions are 0 and 2.9390 respectively. With Sheppard's
corrections, β₂ = 2.9367.

TABLE I.
Normal Correlation Table.

                                                           Totals
            -      -      -      -      1     29     32       62
            -      -      -      5    223    349     29      606
            -      -      8    618   1566    223      1     2416
            -      5    618   2582    618      5      -     3828
            1    223   1566    618      8      -      -     2416
           29    349    223      5      -      -      -      606
           32     29      1      -      -      -      -       62

Totals     62    606   2416   3828   2416    606     62     9996

ρ = 0.9 with Sheppard's correction.
ρ = 0.83 without Sheppard's correction.
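The effect of grouping on the constants of Table I can be checked from the cell frequencies. This is a sketch under stated assumptions: the cell values are those read from the table and confirmed against its row and column totals, class marks are taken one class interval apart, and the usual forms of Sheppard's corrections (h²/12 for the second moment, and μ₄ − ½h²μ₂ + 7h⁴/240 for the fourth) are assumed. The function name is hypothetical.

```python
from math import sqrt

# Table I; rows run from the highest y class to the lowest,
# columns from the lowest x class to the highest.
TABLE_I = [
    [0, 0, 0, 0, 1, 29, 32],
    [0, 0, 0, 5, 223, 349, 29],
    [0, 0, 8, 618, 1566, 223, 1],
    [0, 5, 618, 2582, 618, 5, 0],
    [1, 223, 1566, 618, 8, 0, 0],
    [29, 349, 223, 5, 0, 0, 0],
    [32, 29, 1, 0, 0, 0, 0],
]


def grouped_constants(table, h=1.0):
    """Return rho and marginal beta2, each without and with Sheppard's correction."""
    k = len(table)
    marks = [j - (k - 1) / 2 for j in range(k)]  # class marks, low to high
    cells = [
        (marks[j], marks[k - 1 - i], table[i][j])
        for i in range(k)
        for j in range(k)
    ]
    total = sum(f for _, _, f in cells)
    mx = sum(f * x for x, _, f in cells) / total
    my = sum(f * y for _, y, f in cells) / total
    mu11 = sum(f * (x - mx) * (y - my) for x, y, f in cells) / total
    mu2x = sum(f * (x - mx) ** 2 for x, _, f in cells) / total
    mu2y = sum(f * (y - my) ** 2 for _, y, f in cells) / total
    mu4x = sum(f * (x - mx) ** 4 for x, _, f in cells) / total

    shift = h * h / 12.0  # Sheppard's correction to a second moment
    rho_raw = mu11 / sqrt(mu2x * mu2y)
    rho_corr = mu11 / sqrt((mu2x - shift) * (mu2y - shift))
    beta2_raw = mu4x / mu2x ** 2
    mu4_corr = mu4x - 0.5 * h * h * mu2x + 7.0 * h ** 4 / 240.0
    beta2_corr = mu4_corr / (mu2x - shift) ** 2
    return rho_raw, rho_corr, beta2_raw, beta2_corr


rho_raw, rho_corr, beta2_raw, beta2_corr = grouped_constants(TABLE_I)
print(rho_raw, rho_corr, beta2_raw, beta2_corr)
```

With these frequencies the computation gives ρ of about 0.831 without the correction and about 0.9009 with it, and β₂ of about 2.939 and 2.937 respectively, matching the figures quoted above.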

It was suggested by Karl Pearson, in a letter to the writer, that it would be 
desirable to test by actual sampling, whether observed values of r actually follow 
the theoretical distribution when there is high correlation and grouped frequencies 
in the sampled population. It was the object of this particular experiment to test 
the theory. 

Results of Sampling. 

The sampling was effected by the use of Tippett's numbers†. One thousand
samples of 5 pairs each were obtained from each of the populations R, T, and N. By
clubbing these together in the case of T and of N, five hundred samples of 10 pairs
each were obtained from each of these two populations. The observed distributions
of r are given in Table II.

* See Tables for Statisticians and Biometricians, Part II, Table VIII, pp. 105–106.

† L. H. C. Tippett, Random Sampling Numbers (Cambridge University Press, Tracts for Computers,
No. 15).
TABLE II.
Distribution of r in Samples of 5 and of 10.

[Table of frequencies by class interval of r. Column totals: R₅ 1000, T₅ 1000, T₁₀ 500, N₅ 1000, N₁₀ 500.]

R₅ refers to samples of 5 from rectangular population R.
T₅ and T₁₀ refer to samples of 5 and of 10 respectively from triangular population T.
N₅ and N₁₀ refer to samples of 5 and of 10 respectively from normal population N.

Pearson curves were fitted to three of these distributions with the following
results: for 1000 samples of 5 from the rectangular population, a Type II curve*


y = 1000 x 0°60632 (1 — 72)? eens (1); 
for 1000 samples of 5 from the triangular population, a Type I curve 
y = 1000 x 0°11157 (154516 + r= (092826 — r)-018H (2); 


for 500 samples of 10 from the triangular population, a Type IV curve 
x“ 

y = 500 x 019841 (1 + o50a73 

in which # = r — 09653. 

In cases (2) and (3) the fit was effected by equating the first four moments of the 
curve to the corresponding moments of the observations (without correcting for 
abruptness). In fitting the Type II curve to the distribution of r in 1000 samples of 
5 from the rectangular population it was found that 

7 =0°04685, pw. =0°2660225, ws = — 0°0001043661625, 
a= 0°13714900615625, 8; =0°00005817, B.= 1944985, 
<< “ 
a= (aa) =0:9895, m= ae = 0'34356+. 
Assuming that the small value of 8, justifies a Type II curve, we should get 
-— 004685 yr" 


—5°80245 
) e—7 ez tan—1 (z/o'5767) (3) : 


? 

y = 1000 x 0°596989 | 1 — (’ mee 
Y 09895 
It seemed more desirable, however, to fit a curve of the type $y = y_0 (1 - r^2)^k$,
determining k and y₀ so that theoretical and observed values are equal for standard
deviations and totals. It is readily found that

$$k = \frac{1}{2}\left(\frac{1}{\sigma_r^2} - 3\right), \qquad y_0 = \frac{N\,\Gamma(2k+2)}{2^{2k+1}\,[\Gamma(k+1)]^2} = \frac{N\,\Gamma(k+3/2)}{\sqrt{\pi}\,\Gamma(k+1)},$$

N being the total number of values of r, that is, the number of samples. Equation (1)
was derived in this fashion.
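The two constants of the symmetrical curve follow from the standard deviation and the total alone, so the fit is easy to reproduce. The sketch below is illustrative: the function names are hypothetical, and the input σ = 0.5153 is the observed standard deviation of r for the 1000 samples of 5 from the rectangular population as listed in Table VII.

```python
from math import gamma, pi, sqrt


def symmetric_curve_constants(sigma, n_samples):
    """Constants of y = y0 * (1 - r^2)^k matching a given s.d. and total."""
    k = 0.5 * (1.0 / sigma ** 2 - 3.0)
    y0 = n_samples * gamma(k + 1.5) / (sqrt(pi) * gamma(k + 1.0))
    return k, y0


def y0_duplication_form(k, n_samples):
    """The equivalent expression for y0 in terms of Gamma(2k + 2)."""
    return n_samples * gamma(2 * k + 2) / (2 ** (2 * k + 1) * gamma(k + 1) ** 2)


# Observed standard deviation taken from Table VII (rectangular, samples of 5).
k, y0 = symmetric_curve_constants(0.5153, 1000)
# Normal theory for samples of 5 with rho = 0 has sigma = 0.5 exactly.
k_normal, y0_normal = symmetric_curve_constants(0.5, 1000)
print(k, y0, k_normal, y0_normal)
```

For σ = 0.5 the function returns k = ½ and y₀ = 1000 × 2/π, which is the normal-theory curve for samples of 5 when ρ = 0. For the observed σ it returns a slightly smaller exponent and a leading constant close to the 1000 × 0.60632 of equation (1); the small difference is presumably due to the rounding of σ.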

Although the exact distribution of r for samples of size n from a normal
population has been given‡, the general distribution equation is somewhat complicated,
except in the case when ρ = 0. In the latter case for the distribution of 1000 samples
of 5 it becomes

$$y = 1000 \cdot \frac{2}{\pi}\,(1 - r^2)^{1/2} = 636.62\,(1 - r^2)^{1/2} \qquad (4),$$

an equation which should be compared with (1).

A graphical comparison of results is made in Figures 1—6. The graphs of the 
theoretical distributions of r for samples from continuous normal populations were 
plotted from tables of ordinates of the frequency curves of the correlation coefficient §. 

* See next paragraph.
† For notation see Elderton: Frequency Curves and Correlation.
‡ R. A. Fisher: "Frequency distribution of the values of the correlation coefficient in samples from
an indefinitely large population." Biometrika, Vol. X (1914–15), pp. 507–521.
§ Biometrika, Vol. XI (1915–17), pp. 379 ff.
In the case of samples of 10 from a normal population it will be recalled that
the value of ρ, if Sheppard's correction be not applied, is 0.83. The value of ρ in the
corresponding continuous population is 0.9. The theoretical distribution curves for
both ρ = 0.8 and ρ = 0.9 are plotted. The histogram representing the actual samples
seems to lie closer to the curve corresponding to ρ = 0.8, and it appears that an even
better fit would be obtained if a curve for ρ = 0.83 were interpolated between the
two given curves.

[Figure: Distribution of r in 1000 samples of 5; ρ = 0.5 (correlation in sampled population). The histogram represents the observed distribution in samples from a triangular population. The solid curve is a Pearson curve fitted to the observations. The dashed curve is the theoretical distribution for samples from a continuous normal population.]


[Figure: Distribution of r in 500 samples of 10; ρ = 0.5 (correlation in sampled population). The histogram represents the observed distribution in samples from a triangular population. The solid curve is the graph of the fitted curve $y = y_0 (1+r)^{k_1}(1-r)^{k_2}$ with $y_0 = 500 \times 0.27275$. The dashed curve is the theoretical distribution for samples from a continuous normal population.]
[Figure: Distribution of r in 500 samples of 10; ρ = 0.5 (correlation in sampled population). The histogram represents the observed distribution in samples from a triangular population. The solid curve is a Pearson curve fitted to the observations. The dashed curve is the theoretical distribution for samples from a continuous normal population. Vertical axis: frequency.]

It will be noted that the Pearson curves fitted to the values of r from the
samples drawn from the triangular population fail to have the proper range. It was
consequently thought desirable to fit a curve of the type $y = y_0 (1+r)^{k_1}(1-r)^{k_2}$. It
is found that if the fitting is done by the method of moments the constants have
the following values:

$$k_1 = \frac{(1-\bar r)(1+\bar r)^2 - (\bar r + 3)\,\sigma_r^2}{2\sigma_r^2},$$

$$k_2 = \frac{(1-\bar r)^2(1+\bar r) + (\bar r - 3)\,\sigma_r^2}{2\sigma_r^2},$$

$$y_0 = \frac{N\,\Gamma(k_1 + k_2 + 2)}{2^{k_1+k_2+1}\,\Gamma(k_1+1)\,\Gamma(k_2+1)},$$
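The moment fit of the curve $y = y_0(1+r)^{k_1}(1-r)^{k_2}$ can be reproduced from the mean and standard deviation of r alone. The sketch below is illustrative only: the function names are hypothetical, and the inputs (mean 0.4910, standard deviation 0.2483, N = 500) are the experimental constants for the 500 samples of 10 from the triangular population as listed in Table VII. A crude midpoint-rule integration is included to confirm that the fitted curve has the intended total, mean and standard deviation.

```python
from math import gamma, sqrt


def skew_curve_constants(r_bar, sigma, n_samples):
    """Method-of-moments constants of y = y0 (1 + r)^k1 (1 - r)^k2 on (-1, 1)."""
    var = sigma ** 2
    k1 = ((1 - r_bar) * (1 + r_bar) ** 2 - (r_bar + 3) * var) / (2 * var)
    k2 = ((1 - r_bar) ** 2 * (1 + r_bar) + (r_bar - 3) * var) / (2 * var)
    y0 = n_samples * gamma(k1 + k2 + 2) / (
        2 ** (k1 + k2 + 1) * gamma(k1 + 1) * gamma(k2 + 1)
    )
    return k1, k2, y0


def curve_moments(k1, k2, y0, steps=20000):
    """Total, mean and s.d. of the curve by the midpoint rule."""
    h = 2.0 / steps
    total = first = second = 0.0
    for i in range(steps):
        r = -1.0 + (i + 0.5) * h
        y = y0 * (1 + r) ** k1 * (1 - r) ** k2 * h
        total += y
        first += y * r
        second += y * r * r
    mean = first / total
    return total, mean, sqrt(second / total - mean ** 2)


k1, k2, y0 = skew_curve_constants(0.4910, 0.2483, 500)
total, mean, sd = curve_moments(k1, k2, y0)
print(k1, k2, y0 / 500, total, mean, sd)
```

With these inputs the ratio y₀/N comes out close to 0.2728, in agreement with the leading constant 0.27275 of the curve fitted to the samples of 10.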


in which N is the number of samples. For the samples of 10 the equation of the
curve is

$$y = 500 \times 0.27275\,(1 + r)^{k_1}(1 - r)^{k_2} \qquad (5).$$

The graph is shown in Figure 1.

[Figure: Distribution of r in 1000 samples of 5; ρ = 0 (correlation in sampled population). The histogram represents the observed distribution in samples from a rectangular population. The solid curve is a Pearson curve fitted to the observations. The dashed curve is the theoretical distribution for samples from a continuous normal population. Horizontal axis: r; vertical axis: frequency.]

TABLE III.
Goodness of Fit. Samples of 5 from Rectangular Population. (ρ = 0.)

[Table: for each class interval of r (width 0.05), the observed frequency and the contributions (f₀ − f)²/f for the fitted Pearson curve and for the "normal theory" frequency. Totals: observed frequency 1000; 34.1421; 39.5168.]

For fitted Pearson curve, χ² = 34.1421, n = 38, P = 0.648.

For "normal theory" frequency, χ² = 39.5168, n = 39, P = 0.448.
[Figure: Distribution of r in 1000 samples of 5. The histogram represents the observed distribution in samples from normal correlation table N; ρ = 0.83 without Sheppard's correction, ρ = 0.901 with Sheppard's correction. The solid curve is the theoretical distribution for samples from a continuous normal population having ρ = 0.8. The dashed curve is the theoretical distribution for a continuous normal population having ρ = 0.9.]

[Figure: Distribution of r in 500 samples of 10. The histogram represents the observed distribution in samples from normal correlation table N; ρ = 0.83 without Sheppard's correction, ρ = 0.901 with Sheppard's correction. The solid curve is the theoretical distribution for samples from a continuous normal population having ρ = 0.8. The dashed curve is the theoretical distribution for a continuous normal population having ρ = 0.9.]
The corresponding equation for samples of 5 was not worked out, but it was
noted that the value of k₂ was negative, yielding a curve very much like the solid
curve of Figure 2, starting, however, at r = −1 and having an asymptote at r = 1.


Goodness of Fit. 


The computation, especially the mechanical quadrature, necessary to apply the
χ² goodness of fit test to all of the distributions did not seem warranted. However,
the test was applied in several instances.


The application to the distribution of r in the samples of 5 from the rectangular
population (the case represented in Figure 3) is shown in Table III. The fitted
Pearson curve is given by (1); the "normal theory" frequencies corresponding to
samples of 5 from a normal population in which the correlation ρ is zero were
obtained from Fisher's curve (4). For the Pearson curve, χ² = 34.1421, and since
there are 40 groups and the theoretical distribution has been made to agree with
the observed in the total and in the standard deviation, n = 38, using the notation
n + 1 = n′ of Elderton's Table (Table XII of Pearson's Tables for Statisticians and
Biometricians, Part I). These values are beyond the range of this Table, but by
means of Tables of the Incomplete Γ-function it is found that P = 0.648. (R. A. Fisher's


TABLE IV. 


Goodness of Fit. Samples of 5 from Triangular Population. (p= 05.) 


<a fy-f)? 
frequency theory” | (fo- FY 
i, frequency, f | “4 


0 


| 
| i 
| Observed a Normal 
r 
| | | 


_ | 
= 
_ 
—) 
_ 
to 
= 
2 


e.. 9 111 131°8 3°28 

- “8 123 124-2 01 

On ‘7 lll 109°0 | *O4 

D4 6 101 93°2 “65 

ee “5 103 78°8 7°43 

a ae 6 | 663 | 16 

<a ‘3 | 67 | 55°7 2-29 

: a 2 45 | 46°7 06 

0, 1 36°5 39-2 “19 

0, -'l 32°5 | 32°8 “00 

—]., =2 13 27°4 a"54 

eS 21 «| (829 16 

tt, =% 16 19°0 17 

—-4,, -°5 14 15°6 “16 

= ee ll 12°7 23 

| ae aay 6 10°1 1°66 

-7, -8 12 77 2°40 
—8 , 9 4 5e4 “19 

-*9 ,, -1°0 13 | {36 - 

| Bas aE eer 

| 
Totals 1000°0 =| 1000°0 25°51 


x2=25°51, n=18, P=0112. 
approximate method* yields the value P = 0.654.) For the Fisher curve for samples
of 5 from uncorrelated normal material, χ² = 39.5168, n = 39 (since there are 40 classes
and the theoretical distribution is made to agree only in the total), P = 0.448.
(Fisher's approximate method gives the value P = 0.454.)
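Fisher's approximate method, as described in the footnote, treats $\sqrt{2\chi^2} - \sqrt{2n-1}$ as a unit normal deviate and takes P as the upper tail area. A minimal sketch follows; the function name is hypothetical and the normal tail is obtained from the complementary error function.

```python
from math import erfc, sqrt


def fisher_approx_p(chi2, n):
    """Approximate P for chi-square with n degrees of freedom (Fisher's method)."""
    t = sqrt(2.0 * chi2) - sqrt(2.0 * n - 1.0)
    # Upper tail of the unit normal curve beyond t.
    return 0.5 * erfc(t / sqrt(2.0))


p_pearson_curve = fisher_approx_p(34.1421, 38)
p_normal_theory = fisher_approx_p(39.5168, 39)
print(round(p_pearson_curve, 3), round(p_normal_theory, 3))
```

Applied to the two cases above, the approximation gives values that round to 0.654 and 0.454, as quoted.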

The goodness of fit test for samples of 5 from the triangular population is shown
in Table IV. The "normal theory" frequencies are those corresponding to samples
of 5 from a normal population in which ρ = 0.5†.

Here χ² = 25.51, n = 18 (n′ = 19 for use in Pearson's Tables for Statisticians and
Biometricians), and P = 0.112.

For samples of 10 from the triangular population (see Table V) the value of χ²
is 46.89, n = 22, and P = 0.00153, the only extremely bad fit noted.
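The values of P quoted in this section were read from published tables. They can be recomputed from the finite series for the χ² tail area, which are closed forms for even and for odd n. The sketch below is an illustration under that assumption, not the procedure used in the paper, and the function name is hypothetical.

```python
from math import erfc, exp, pi, sqrt


def chi2_upper_tail(chi2, n):
    """P(chi-square with n degrees of freedom exceeds chi2), by finite series."""
    half = chi2 / 2.0
    if n % 2 == 0:
        # Even n: exp(-x/2) * sum_{k < n/2} (x/2)^k / k!
        term = total = 1.0
        for k in range(1, n // 2):
            term *= half / k
            total += term
        return exp(-half) * total
    # Odd n: normal tail plus a finite sum in odd powers of sqrt(chi2).
    x = sqrt(chi2)
    total = 0.0
    term = x
    for r in range(1, (n - 1) // 2 + 1):
        if r > 1:
            term *= chi2 / (2 * r - 1)
        total += term
    return erfc(x / sqrt(2.0)) + sqrt(2.0 / pi) * exp(-half) * total


for chi2, n in [(34.1421, 38), (39.5168, 39), (25.51, 18), (46.89, 22)]:
    print(chi2, n, round(chi2_upper_tail(chi2, n), 5))
```

The results agree with the quoted values of P to within the accuracy of the printed figures; for χ² = 46.89 and n = 22, for instance, the series gives a value near 0.0015.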


TABLE V, 
Goodness of Fit. Samples of 10 from Triangular Population. (p =0°5 ) 





n l r j | 
| Observed | “Normal | (f,-f)? 
r | frequency | theory” | =*%-~ 

he | frequency, f | J | 

i og ata | | 

| 95 to 1-00 3. | o7 | = 

90 ,, 95 {3 {a3 Bees | 

85, 90 11 12-9 28 | 
80 ,, 8&5 17 22°4 1°30 

1e ce 80 28 30°5 20 | 
70 , 75 27 36°8 2°61 

| 65 70 40 40°1 00 | 
60 ,, 65 46 41°] 58 

55 .. 60 54 40°2 4°74 

50 , 55 31 38°0 1°29 

45 , 50 36 34°9 03 

| 40 45 } 36 31°5 64 
35 ,, 40 38 27°9 3°66 
30 ,, Bd | 19 24°3 1-16 

25 .. *30 22 20°9 “O06 
20 ,, °25 22 17°7 1°04 

70 20 | 15 14°9 “00 | 

20 .. 15 22 12°4 7°43 

05 , 10 6 10°2 1°73 | 
00 ,, 05 2°5 8-4 4°14 

00 ,, —°05 4$°5 6°8 ‘78 

— 05 ,, ‘10 2 54 ae | 

10 ,, —*15 {; 16-3 ne 

15 20 3 ‘a | 

20 J5 {5 126 1°50 

a “30 3 2°0 

20 ,, —°35 1°5 62 

3elow *35 | 2 | 3°6 
Totals 500°0 500-0 46°89 


x’ 46°89, n= 22, P=0°00153. 
* Put $t = \sqrt{2\chi^2} - \sqrt{2n - 1}$; then $P = \int_t^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}t^2}\, dt$ approximately.

† See Biometrika, Vol. XI, p. 381. The frequencies were obtained from the ordinates by quadrature.
The distribution of r for samples of 10 from the normal correlation table (Table I)
in which ρ = 0.83 before Sheppard's correction is applied (ρ = 0.9 after) was tested
for goodness of fit after making Fisher's transformation* z = tanh⁻¹ r, ζ = tanh⁻¹ ρ.
The variate z is then approximately normally distributed with mean

$$\bar z = \zeta + \frac{\rho}{2(n-1)}\left[1 + \frac{1 + \rho^2}{8(n-1)}\right] \qquad (6)$$

and variance

$$\sigma_z^2 = \frac{1}{n-1}\left[1 + \frac{4 - \rho^2}{2(n-1)} + \frac{176 - 21\rho^2 - 21\rho^4}{48(n-1)^2}\right] \qquad (7).$$

In the present instance, it is found that $\bar z$ = 1.23533, $\sigma_z^2$ = 0.13588. The
transformation and the χ² test are worked out in Table VI. It is found that P = 0.024. The
fit is not very good, but the main discrepancies are somewhat irregularly scattered
throughout the distribution.
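Formulae (6) and (7) are simple to evaluate, and the numerical values quoted in the paper provide a check on them. The following sketch is illustrative: the function name is hypothetical, and the second set of inputs (ρ = 0.4602, n = 10) is the case of Population A treated in Part II.

```python
from math import atanh, sqrt


def fisher_z_constants(rho, n):
    """Return (zeta, mean of z, variance of z) from formulae (6) and (7)."""
    zeta = atanh(rho)
    m = n - 1
    mean = zeta + rho / (2 * m) * (1 + (1 + rho ** 2) / (8 * m))
    var = (1 / m) * (
        1
        + (4 - rho ** 2) / (2 * m)
        + (176 - 21 * rho ** 2 - 21 * rho ** 4) / (48 * m ** 2)
    )
    return zeta, mean, var


# Samples of 10 from the normal table with rho = 0.83 (uncorrected).
zeta_n, mean_n, var_n = fisher_z_constants(0.83, 10)
# Samples of 10 from Population A of Part II, rho = 0.4602.
zeta_a, mean_a, var_a = fisher_z_constants(0.4602, 10)
print(mean_n, var_n, zeta_a, mean_a, sqrt(var_a))
```

The first case gives a mean near 1.2353 and a variance near 0.1359; the second gives ζ, the mean of z and σ_z in agreement with the values 0.497565, 0.523562 and 0.373324 reported for Population A.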
TABLE VI. 
Goodness of Fit. Samples of 10 from Normal Population having High Correlation. 
Fisher's Transformation. 





| j . Normal | Observed | - rye 


r z=tanh"'r — area A 500A=/f | frequency Vo-J Y 
o, } (1l+a) | | So | 
— — es \— — j 
| 1°00 | 20 x 1-00000 eles ie A 
| -95 | 1°831 7808 +1°6182 ‘94719 poet Bs a - “a: | 
“90 1°472 2195 + 6427 | -73978 91795 108-62 99 | ona 
“85 1°256 1528 + °0565 "52253 167174 3-59 8G | “<07 
“80 1°098 6123 — °3709 64464 1 1707 R853 46 2-67 
“75 972 9551 ~ ‘7118 ‘76171 997 39-64 56 6°79 
“70 "867 3005 9985 84098 5302 26-51 93 “46 
65 775 2987 — 1°2481 89400 3534 17°67 31/77 
“60 ‘693 1472 —1:4709 ‘92934 9357 11-78 15 87 | 
“DD 618 3813 — 1°6738 95291 1573 wae 1D 6°38 
50 549 3061 -1°8612 ‘96864 1050 5 oF 4 27 | 
“45 -484 7003 — 2°0365 ‘97914 03 3:5] 3 ihe 
“40 423 6489 2°2021 98617 169 {3-35 {3 700 | 
“B35 *365 4438 — 2°3600 *99086 313 1°36 1 
"30 +309 5196 2°5117 “99399 909 1-05 
"25 *255 4128 — 2°6585 “99608 137 68 > | 
20 | +202 7326 —2°8014 | 99745 99 “46 . “03 
15 *151 1404 2°9414 "99837 60 \ 30 
“10 -100 3353 —-3°0792 “99897 38 19 
05 -050 0417 - 3°2157 “99935 25, “12 
‘00 “000 0000 3°3515 ‘99960 -00040 -20 (<0) l2 
— 05 “050 0417 
| 
| 500-00 500 23°47 


| 
' | 


Z=1°2351, o, =0°3686, = 23°47, n=12, P=0°024. 

* See R. A. Fisher, "On the 'probable error' of a Coefficient of Correlation deduced from a small
Sample," Metron, Vol. I. No. 4 (Sept. 1, 1921), pp. 3–32. E. S. Pearson has given an illustration showing
the degree of accuracy of this approximation in a very similar case (n = 10, ρ = 0.8) (Biometrika, Vol. XXI,
p. 359).

† Note that we have a positive and a negative value of $(z - \bar z)/\sigma_z$.
Comparison of Constants.


In Table VII the means, the standard deviations, and the betas of the
distributions of r for the observed samples from the various populations are compared with
the respective constants* for the corresponding theoretical distributions for samples
from normal populations.


TABLE VII. 


Comparison of Constants for Distribution of Correlation Coefficient r. 


Sampling Experiment 


1000 samples of 5 from 
rectangular population 
(p=0-0) 


1000 samples of 5 from 
triangular population 
(p=0°5) 


500 samples of 10 from 
triangular population 
(p=0°5) 


1000 samples of 5 from 
normal population 
(p=0°83 for discrete 
frequencies and p=0°9 
for continuous) 


500 samples of 10 from 
normal population 
as for samples of 5) 


The standard errors shown are all approximate.
The standard errors of β₁ and β₂ were taken from
of Karl Pearson's Tables for Statisticians and Biometricians, Part I.
which β₁ = 0 the approximate formula†

* The values corresponding to ρ = 0.9 and ρ = 0.8 were found directly in Biometrika, Vol. XI
(1915–17), pp. 399 ff. The values corresponding to ρ = 0.83 were obtained by interpolation.
† See E. S. Pearson, "The Test of Significance for the Correlation Coefficient," Journal of the
American Statistical Association, Vol. XXVI (1931), p. 18.





| Experiment 
| Normal theory 


| Experiment 
Normal theory 
| Standard Error 


Standard Error | 


Experiment 
Normal theory 
Standard Error 


Experiment 


| Normal theory 


p=0°80 
Standard Error 

” p=0°83 

5 p=0°90 

Standard Error 


Experiment 
Normal theory 
p=0'80 
Standard Error 
» p=0'83 
» p=0'90 


Standard Error 


r 


-0°0031 


0 
0°0158 


0°4653 
0°4517 
0°0134 


0°4910 
0°4787 
0°0119 


0°7852 
0°7541 
0°0085* 
0°7873 
0°8687 
0:0055* 
0°8012 


0°7819 





| 


0°0065* | 


0*8135+ 
0°8887 
0°0037 


Co, 


0°5153 


0°5000 
0°0079 


0°4127 
0°4239 
0°0104 


0°2483 
0°2671 
0°0098 


0°2306 


0:2691 
0°0126 
0°2461 
0°1748 
0°0127 


0°1433 


0°1461 
0°0087 
0°1288 
0°0832 
0:0066 


By 


0-0000 5817 
| (/8;=0-0076) 
0 


| ae = 
| (o jg, =0°0447) 
| 


1*3455- 
1°0315 
0°1312 


0°8882 
| 0°7431 
0°2085* 


| 5*0457 


5°4065 
| 130290 
| anes 
; 
| 

3°5357 


3°1377 





3-9442 
| 3-4191 
| 0°2096 


5°4425 
3°6774 
0°4737 


9°2953 


9°7830 


21°7579 


9°4080 


8°0534 
| 9-2 
13°6667 


They were obtained as follows: 


Tables XXXVIT and XXX VIII 


$\sigma_{\sqrt{\beta_1}} = \{(\beta_4 - 6\beta_2 + 9)/N\}^{1/2}$


In the case in 








7 
for the standard error of √β₁ was used. For the other constants the formulae*

$$\sigma_{\bar r} = \frac{\sigma_r}{\sqrt{N}}, \qquad \sigma_{\sigma_r} = \frac{\sigma_r}{2}\sqrt{\frac{\beta_2 - 1}{N}}$$

were employed. Here N is the number of samples, which in the present cases is
either 1000 or 500.
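The standard errors in Table VII can be recomputed from the normal-theory constants. The sketch below is illustrative and rests on stated assumptions: the function names are hypothetical; σ_r and β₂ for ρ = 0.5 are the theoretical values listed in Tables XI and XII; and for the symmetrical case (samples of 5, ρ = 0) the values β₂ = 2 and β₄ = 5 are derived here from the curve (4) and are not taken from the paper.

```python
from math import sqrt


def se_mean(sigma, n_samples):
    """Standard error of the mean of r."""
    return sigma / sqrt(n_samples)


def se_sd(sigma, beta2, n_samples):
    """Standard error of the standard deviation of r."""
    return 0.5 * sigma * sqrt((beta2 - 1.0) / n_samples)


def se_sqrt_beta1_symmetric(beta2, beta4, n_samples):
    """Standard error of sqrt(beta1) when beta1 = 0 (approximate formula)."""
    return sqrt((beta4 - 6.0 * beta2 + 9.0) / n_samples)


# Triangular population, rho = 0.5: samples of 5 (N = 1000) and of 10 (N = 500).
print(se_mean(0.4239, 1000), se_sd(0.4239, 3.4191, 1000))
print(se_mean(0.2671, 500), se_sd(0.2671, 3.6774, 500))
# Rectangular population, rho = 0, samples of 5; beta2 = 2, beta4 = 5 assumed.
print(se_mean(0.5, 1000), se_sd(0.5, 2.0, 1000),
      se_sqrt_beta1_symmetric(2.0, 5.0, 1000))
```

These reproduce the tabulated standard errors 0.0134 and 0.0104, 0.0119 and 0.0098, and 0.0158, 0.0079 and 0.0447 to the number of figures given.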


In most instances the deviations of the constants from their expected values in
samples from normal populations are not improbable. The worst discrepancy seems
to be that of β₂ in the 500 samples of 10 from the triangular population. For the
samples from the normal population having high correlation the constants seem to
be in substantial agreement with the theoretical values for samples from a continuous
normal population having ρ = 0.83, which is the value obtained from the sampled
correlation table without Sheppard's correction for grouping.





II. THE EFFECT OF THE COARSENESS OF GROUPING.
Description of the Populations sampled.

To study the effect of the coarseness of grouping upon the distribution of r two
correlation tables were constructed from tables of volumes of the normal surface†
for ρ = 0.5. These are shown in Tables VIII and IX and are designated Populations A
and B respectively.


TABLE VIII.

Population A.

                                                           Totals
            0      1      7     32     67     64     30      201
            1     11     74    229    326    212     64      917
            7     74    346    742    727    326     67     2289
           32    229    742   1097    742    229     32     3103
           67    326    727    742    346     74      7     2289
           64    212    326    229     74     11      1      917
           30     64     67     32      7      1      0      201

Totals    201    917   2289   3103   2289    917    201     9917

ρ = 0.4602, without Sheppard's correction.
ρ = 0.49008, with Sheppard's correction.


Population A. This is a 7 × 7 cell table. The class interval is 0.80σ, or 0.82σ
if Sheppard's correction has been applied. The value of ρ as actually calculated from
the table is 0.4602. With Sheppard's correction (applied to the two marginal standard
deviations involved in ρ), ρ = 0.49008, essentially the same as in the continuous
frequency surface.

* See footnote † on p. 396.
† See Tables for Statisticians and Biometricians, Part II, Table VIII, pp. 78–135.























TABLE IX.
Population B.

                                             Totals
            0      9     90    170     68      337
            9    227   1028    940    170     2374
           90   1028   2273   1028     90     4509
          170    940   1028    227      9     2374
           68    170     90      9      0      337

Totals    337   2374   4509   2374    337     9931

ρ = 0.4377, without Sheppard's correction.
ρ = 0.4924, with Sheppard's correction.

Population B. This is a 5 × 5 cell table. The class interval is 1.16σ, or 1.23σ
if Sheppard's correction has been applied. The value of ρ calculated from the table
is 0.4377. With Sheppard's correction, ρ = 0.4924.


Results of Sampling. Comparison of Constants. 
Five hundred samples of 5 pairs each were drawn (by Tippett’s numbers) from 
Population A and also from Population B*. These were combined to form 250 
samples of 10 from each of the populations. 


In the case of samples of 10 from Population A, the values of r were calculated 
both without and with Sheppard’s correction. In the rather rare instances in which 
this correction led to a value of r greater than unity the value r=1 was used. 
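The treatment of individual samples described here can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the function name is hypothetical, moments are taken with divisor n, class marks are assumed one class interval apart (h = 1), and a sample whose variance is zero, or is not positive after correction, is reported as indeterminate by returning `None`.

```python
from math import sqrt


def sample_r(xs, ys, sheppard=False, h=1.0):
    """Correlation coefficient of one sample, optionally with Sheppard's correction."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mx) ** 2 for x in xs) / n
    var_y = sum((y - my) ** 2 for y in ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    if sheppard:
        var_x -= h * h / 12.0
        var_y -= h * h / 12.0
    if var_x <= 0.0 or var_y <= 0.0:
        return None  # indeterminate form: a standard deviation vanishes
    r = cov / sqrt(var_x * var_y)
    # A corrected value above unity is replaced by r = 1.
    return min(r, 1.0) if sheppard else r


# Hypothetical samples of 5 pairs, expressed in class-interval units.
xs, ys = [-1, 0, 0, 1, 2], [0, -1, 1, 1, 1]
print(sample_r(xs, ys), sample_r(xs, ys, sheppard=True))
print(sample_r([0, 1, 2, 3, 4], [0, 1, 2, 3, 4], sheppard=True))
print(sample_r([1, 1, 1, 1, 1], [0, 1, 2, 3, 4]))
```

The second call shows the clamping: the uncorrected value is exactly 1, the correction would raise it above unity, and the function returns 1. The last call shows an indeterminate form of the kind that was discarded and replaced in the sampling from Population B.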


The resulting distributions of r are shown in Table X and are compared with
the corresponding theoretical distributions from a continuous population in Tables XI
and XII through the values of the means, the standard deviations, and the two
betas. There is some indication, supported by the result of the goodness of fit test
provided in Tables XIII and XIV below, that the distribution of r for samples
from a grouped normal population tends to be the same as that for samples from
a continuous population having ρ equal to the value obtained from the grouped
population without applying Sheppard's correction. This tendency is also marked
in the samples from the normal population having high correlation. (See Part I.)


* Twenty indeterminate forms occurred in calculating r from samples from Population B. By
indeterminate form is meant a sample in which one of the standard deviations used in calculating r is zero
(or in which both of them are zero). A typical example that actually occurred is


Each sample like this was discarded and replaced by a new sample.


TABLE X. 
Distribution of r in Samples of 5 and of 10 from Populations A and B. 

















] 
Frequencies 
- Teena RancGatesl Hine s 
1 i 2 a t - 
| "95 to 1°00 14 15 1 1 
we “95 26 33 1 4 
85 .. 90 27 34 4 2 
80 85 21 28 4 8 
75 80 33 31 18 15 
70 75 28 12 14 14 
"65 ,, 70 26 19 21 15 
Ow 65 38 41 17 20 
55 ,, 60 20 15 23 26 
50 ,, 55 33°5 27 18 20 
45. 50 27°5 21 15 15 
"40 ,, 45 27 30 14 17 
‘ee 40 16 18 18 7 
"30 ,, 35 2 17 15 17 
"25 30 13°5 13 12 li 
20 .. 25 75 18 li 7 
15 ,, 20 13 1] 7 8 
10 , 15 11 10 7 | 4 
05 , 10 19 5 7 | 4 
00 ,, 05 9°5 19 7 5 
00 — 05 10°5 17 2 75 
—-*05 , —:10 3 0 5 6°5 
| -"10 , —°15 { 7 2 2 
15 — °20 7 6 1 3 
| 20 — °25 x 9 3 2 
| —"25 | 30 5 4 " 2 
30 ,, 35 3 l | 
35 ,, - 40 2 2 1 
“40 — 45 6 12 2 
—°45 0 2 l | 
— “50 — “hd 6 4 1 
55 — ‘60 3 1 - 
- "60 , 65 2 6 
“65 —°70 3 3 
+ oe 2 | 1 | 
| —"75 — +80 <a 3 | 
— °80 *R5 2 1 
| -"85 , —90 2 
90, —°95 2 | aX | 
-— "95 ,, —1°00 2 | 
| 
| Totals 500 | 500 | 250 | 250 
| 


A₅ and A₁₀ refer to samples of 5 and of 10 respectively from Population A.
B₅ and B₁₀ refer to samples of 5 and of 10 respectively from Population B.
TABLE XI.
Constants of Distribution of r for 500 Samples of 5.

Population                                        Mean r     σ_r        β₁        β₂

Experimental:
A (7 × 7 cells), ρ = 0.4602, *ρ_c = 0.49008       0.42305    0.4255+    1.1281    3.5818
B (5 × 5 cells), ρ = 0.4377, *ρ_c = 0.4924        0.4224     0.4279     0.8340    3.2480

Theoretical:
Continuous distribution,† ρ = 0.5                 0.4517     0.4239     1.0315    3.4191
     "          "         ρ = 0.4                 0.3584     0.4528     0.5909    2.8097
     "          "         ρ = 0.4602              0.4143     0.4363       -         -
     "          "         ρ = 0.4377              0.3933     0.4429       -         -






TABLE XII. 
Constants of Distribution of r for 250 Samples of 10. 














| 
Population 7? o B, Bo 
ev eM ACTOR Baked, Laie Some! f 
Experimental : 
A (7x7 cells), p=0°4602, * p,=0°49008 0°4440 0°2662 0°4400 | 3°0453 
B (5x5 cells), p=0°4377, *p.=0°4924 0°4431 0-2794 | 0°5583 | 32786 
A (Sheppard’s correction applied to 
each sample)... ae ... | 0°4780 | 0°2855+ | 0°5289 | 3°1995+ 
Theoretical : | 
Continuous distribution, + p=0°5 0°4787 0°2671 | O°7431 | 3°6774 
” ” p=04 0°3813 0°2917 0°4374 | 3°1669 
‘9 a p=0°4602 0°4398 | 02777 0-606 3-45- | 
8 se p=0°4377 0-4179 | 0-283 0538 333 











* ρ_c is the value of the correlation coefficient when Sheppard's correction has been applied to the
marginal standard deviations of the correlation table.
† The values of the constants of the continuous distribution are independent of the number of samples,
although dependent on the number in the sample.


Fisher's transformation z = tanh⁻¹ r, ζ = tanh⁻¹ ρ, used in Part I, was also used
here to obtain theoretical frequencies.

For Population A: ζ = tanh⁻¹ 0.4602 = 0.497565-, z̄ = 0.523562 by formula (6),
σ_z = 0.373324 by formula (7); the goodness of fit test, worked out in Table XIII,
gives P = 0.9503. For Population B: ζ = tanh⁻¹ 0.4377 = 0.469382, z̄ = 0.494101,
σ_z = 0.373515-, P = 0.502. (See Table XIV.)
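
The first four columns of Tables XIII and XIV follow mechanically from the transformation, and a short sketch may make the steps easier to follow. The function names below are illustrative and not the authors'; the values of z̄ and σ_z are those quoted above for Population A, and the normal area is taken, as the column heading ½(1+α) suggests, to be the unit normal area up to the absolute value of the deviate.

```python
import math


def fisher_z(r):
    """Fisher's transformation z = tanh^-1 r."""
    return math.atanh(r)


def normal_area(t):
    """Area 1/2 (1 + alpha) of the unit normal curve up to |t|."""
    return 0.5 * (1.0 + math.erf(abs(t) / math.sqrt(2.0)))


def table_row(r, z_bar, sigma_z):
    """Return (z, standardized deviate, normal area) for a class boundary r."""
    z = fisher_z(r)
    t = (z - z_bar) / sigma_z
    return z, t, normal_area(t)


# Constants quoted in the text for Population A, samples of 10.
Z_BAR_A = 0.523562
SIGMA_Z_A = 0.373324

zeta_a = fisher_z(0.4602)   # population value for A
zeta_b = fisher_z(0.4377)   # population value for B
row_95 = table_row(0.95, Z_BAR_A, SIGMA_Z_A)
row_50 = table_row(0.50, Z_BAR_A, SIGMA_Z_A)

print(round(zeta_a, 6), round(zeta_b, 6))
print([round(v, 5) for v in row_95])
print([round(v, 5) for v in row_50])
```

The differences of successive normal areas give the column A, except in the class containing z̄, where the two areas measured from the mean are added; this is the point of the starred note to the tables.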


SUMMARY AND CONCLUSIONS. 


Actual samples of 5 and of 10 were drawn from bivariate populations differing
greatly from the normal. Population correlations ρ = 0 and ρ = 0.5 were used. The
distributions of r were not essentially different from the theoretical distributions in
samples from a normal population which may be considered in fact as providing
TABLE XIII.
Goodness of Fit. Samples of 10 from Population A. Fisher's Transformation.

r | z = tanh⁻¹ r | (z − z̄)/σ_z | Normal area ½(1+α) | A | 250A = f | Observed frequency f_o | (f_o − f)²/f
1-00 oa) @ 1-00000 eee 3 | 
‘95 | 1:831 7808 35042 99977 | “OOORs at on 
90 | 1-472 2195 2-541 ‘99447 1988 la-s3 | 4 
‘85 | 1-256 1528 1-9623 ‘97515 oon a i 206 
"80 1-098 6123 1°5404 “93827 pe oat 1-80 
“75 ‘9729551 | 1°2038 | -88570 aaa me | 4 26 
70 ‘867 3005 | 9208 | -82147 =) we) a” 
‘ r | | ‘ aR > | ei 
Es aoc: | ae Pty ba ol 7150 | 17°88 | 2 54 
65 | 775 2987 | 6743 | 74997 7489 18°72 17 16 
“60 693 1472 | 4543 | -67508 sae. | ~amewa Tt 93 : 
55 | 6183813 |  -2540 | -60025 483 | lev | ff = | 
“ co | toa 7274 | 1818 | 18 -00 
‘50 | 5493061 | 0690 | -59751 aut | wae | : aa 7 
“45 4847003 | — 1041 | “52141 ooo BES 6. at BR. = 
; len be RR Be. 6409 | 16-02 14 25 
40 4236489 | — -2676 | -60550 sess | 1aen | we an 
*35 "365 4438 | — -4235+ | -66403 save | 319 |. 15 95 
“30 3095196 | — °5733 | -71679 wes | tien 2 on 
“25 2554128 | — -7183 | -76362 4121 10°30 | il 05 
20 -202 7326 | — -8594 | -80483 ao og “ Me 
15 1511404 | — -9976 | -84074 nn | 770 . a 
“10 "100 3353 | —1°:13387 | °87154 2606 6°52 cf 04 
05 | 0500417 | -1:2684 | -89760 pone pend . a | 
-00 0000000 | -1-4024 | -91962 | 33,7 rte ‘ 4 
~-05 | —-0500417 | -1:5365- | -98779 | gigas {371 {5 19 
--10 | —-1003353 | -1-6712 | -95964 ee 
—"15 4 
—°20 : 
— +25 + 04736 11°84 1; 68 
~ +30 | 
— "35 
— “40 8 
A nw aan ae ee . 
| | | | 
| | 250-00 | 
| | 


250 | 10°09 


z̄ = .523,562, σ_z = .373,324, χ² = 10.09, n = 19, P = 0.9503.
* Note that we have a positive and a negative value of (z − z̄)/σ_z.


good first approximations*. If this is true for samples containing so few items as 
5 or 10, it will assuredly hold true for larger samples. 


The values of r in actual samples (n =5 and n=10) from a normal population 
having high correlation seem to follow the theoretical distribution curve. 


* Cf. E. S. Pearson, "Some Notes on Sampling Tests with two Variables," Biometrika, Vol. XXI (1929),
pp. 337–360. In commenting on certain of his experiments, Pearson says, "These two series of results
are of considerable interest and suggest that the normal bivariate surface can be mutilated and distorted
to a remarkable degree without affecting the frequency distribution of r in samples as small as 20."
(p. 357.)

See G. A. Baker, "The Significance of the Product-Moment Coefficient of Correlation with special
reference to the Character of the Marginal Distributions," Journal of the American Statistical Association,
Vol. XXV (1930), pp. 387–396. See also E. S. Pearson, "The Test of Significance for the Correlation
Coefficient," Journal of the American Statistical Association, Vol. XXVI (1931), pp. 128–134. Read E. S.
Pearson's comments on Baker's results. [The conclusion above seems not wholly consistent with the
P's of Tables V and VI. Ed.]
TABLE XIV.
Goodness of Fit. Samples of 10 from Population B. Fisher's Transformation.

r | z = tanh⁻¹ r | (z − z̄)/σ_z | Normal area ½(1+α)
1°00 a)  @ 100000 
“95 1°831 7808 3°5813 “99983 
“90 1°472 2195 2°6187 “99590 
*85 1°256 1528 2-0402 “97932 
“80 1°098 6123 1°6184 *94721 
“75 *972 9551 1°2820 “90008 
“70 *867 3005 *9992 *84110 
*65 *775 2987 *7528 *77421 
“60 “693 1472 *5329 *70298 
"D5 “618 3813 *3327 *63032 
*50 *549 3061 *1478 "55875 
*45 *484 7003 — °0252 “51005 
“40 *423 6489 — *1886 *57480 
*B5 *365 4438 — "3445 "63474 
*30 “309 5196 — *4942 *68934 
"25 *255 4128 — *6390 *73859 
*20 *202 7326 — ‘7801 *78233 
*15 “151 1404 — °9182 *82074 
‘10 *100 3353 — 1°0542 *85410 
“05 “050 0417 — 1°1889 *88278 
‘00 *000 0000 — 1°3228 “90708 
— °05 — ‘050 0417 — 1°4568 "92744 
-'10 — °100 3353 — 1°5915- "94425 
-°15 — ‘151 1404 — 1°7275- “95796 
— °20 — *202 7326 — 1°8656 “96895 
—°25 
— °30 
— °35 
—°40 
—°*45 
— °50 














A 


“00017 
393 
1658 
3211 
4713 
5898 
6689 
7123 
7266 
7157 
6880* 
6475 
5994 
5460 
4925 
4374 
3841 
3336 
2868 
2430 
2036 
1681 
1371 
‘01099 


“03105 








250A=f 





249-98 





Observed 





frequency (fo-F)* 
0 f 
1 
{s “66 
2 
8 ‘Ol 
15 *88 
14 37 
15 18 
20 27 
26 3°38 
20 *25 
15 *28 
17 “04 
7 4°25 
17 *82 
17 1°78 
7 1°42 
8 27 
4 2°26 
5 “66 
7°5 *33 
6°5 “04 
(2 
3 1°10 
le 
2 
| 1 
| 4 07 
Lz 
250 19°32 








Samples of 5 and of 10 were taken from coarsely grouped normal correlation tables.
The value of ρ in these tables was 0.5 if Sheppard's correction was applied; without
this correction it was somewhat less. The distribution of r in these samples seems
to be essentially the same as the theoretical distribution of r in samples from
a continuous population in which the value of ρ is that obtained from the correlation
table without applying Sheppard's correction. This might be taken as indicating
that Sheppard's correction should be applied to the sample r†, although this appears
only partially to make the distribution the same as that for samples from a
continuous population. It is undoubtedly better to avoid coarse grouping.


z̄ = .494,101, σ_z = .373,515, χ² = 19.32, n = 20, P = 0.502.

* Note that we have a positive and a negative value of (z − z̄)/σ_z.

† See R. A. Fisher, Statistical Methods for Research Workers, p. 152.
THE PERCENTAGE LIMITS FOR THE DISTRIBUTION
OF RANGE IN SAMPLES FROM A NORMAL
POPULATION. (n ≤ 100.)


By EGON S. PEARSON, D.Sc.


THE Table A given on p. 416 below represents an attempt to summarise in most 
convenient form for practical use the recent work on the distribution of range in 
samples from a normal population*. It deals only with the case of samples of 100 
or less. The accompanying discussion may be divided into three parts: 


(1) The method of computation of Table A. (See p. 416.) 
(2) Experimental checks on the adequacy of the approximation involved. 


(3) Illustrations of the use of Table A. 


In addition to the value of mean and standard deviation, of which the former
has already been completely and the latter partially tabled, the present Table gives
certain percentage limits, namely the values of the range which will (a) not be
attained, and (b) be exceeded, in 0.5%, 1%, 5%, and 10% of random samples.
The unit is throughout the population standard deviation.


(1) The Method of Computation of Table A. 


It has been assumed that the sampling distribution of range may be adequately 
represented by Pearson curves with the appropriate moment-coefficients. That this 
assumption is not unreasonable will be seen from the experimental results presented 
below, but since it has had to be made, no very high degree of accuracy is justified 
in calculating the percentage limits from these curves. Nor indeed is a high degree 
of accuracy required for practical purposes. The procedure adopted may be sum- 
marised as follows : 


(a) A framework was first obtained by finding the equations of the Type I
and Type VI curves, using the appropriate frequency constants (set out in Table VIII
of my paper referred to above), for samples of size

n = 3, 4, 6, 10, 20, 60, 100.


The first four of these curves were made to start at the point, range = w = 0, and
given the correct mean (= w̄), standard deviation (= σ_w), and β₁. For the curves at
n = 20, 60 and 100 the start was not fixed and the first four theoretical moment


* L. H. C. Tippett, Biometrika, Vol. XVII. pp. 364–387. E. S. Pearson, Biometrika, Vol. XVIII.
pp. 173–194. "Student," Biometrika, Vol. XIX. pp. 151–164. Tables for Statisticians and Biometricians,
Part II, pp. cx–cxix.
coefficients were used: w̄, σ_w, β₁ and β₂. The curves given by "Student"* were all
calculated by the second method; since however the distribution of range is abrupt
at the lower end when n is small, it seemed probable that a better representation
of the true but unknown sampling distribution would be obtained in these cases
by making the approximative curve have the correct start. By the time n = 20 it
was considered of greater importance to use the correct β₂ rather than the correct
start.


For the case n = 10 the percentage limits were however found both from the 
fixed start and the 4-moment curve, with the following results. 








TABLE I.

                    |        Lower Limits         |        Upper Limits
Per cent. Limits    | 0.5% |  1%  |  5%  |  10%  |  10%  |  5%  |  1%  | 0.5%
Fixed start Curve   | 1.35 | 1.48 | 1.86 | 2.09  | 4.13  | 4.48 | 5.15 | 5.40
4-moment Curve      | 1.32 | 1.46 | 1.86 | 2.10  | 4.13  | 4.47 | 5.16 | 5.42


The figures provide some idea of the order of uncertainty involved in the method
of approximation used. The addition of a 3rd decimal place in the limits would
clearly be meaningless, but the retention of the 2nd decimal appears worth while†.


(b) For each of these framework curves the position of the ordinate at the
lower and upper tails cutting off 0.5%, 1.0%, 5.0%, and 10.0% of the total
frequency was found by quadrature and backward interpolation. If w_p represents
the range value corresponding to any one of these ordinates, then the quantities

$$ l_p = (w_p - \bar{w})/\sigma_w \qquad (1) $$

were calculated. For a given per cent. limit, p, the value of l_p will change with n,
that is to say with the β₁ and β₂ or shape of the sampling curve. But the change
is not very rapid, and it was found possible to interpolate in the framework so as
to find with the desired accuracy each of the 8 values of l_p for

n = 3, 4, 5, ... 29, 30, 35, 40, ... 95, 100‡.


(c) Having calculated the l_p's, it was only necessary to invert the formula (1),
and obtain w_p from

$$ w_p = \bar{w} + l_p\,\sigma_w \qquad (2). $$
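
Formulae (1) and (2) amount to standardizing a limit and then undoing the standardization, which may be sketched as follows. The function names are illustrative only; the numerical values for n = 5 (mean range 2.3259, standard deviation of range .8641, upper 0.5% limit 4.85, all in units of the population standard deviation) are those quoted later in Example 3.

```python
def standardized_limit(w_p, w_bar, sigma_w):
    """Formula (1): the limit expressed as a deviation from the mean range
    in units of the standard deviation of range."""
    return (w_p - w_bar) / sigma_w


def range_limit(l_p, w_bar, sigma_w):
    """Formula (2): recover the percentage limit from its standardized value."""
    return w_bar + l_p * sigma_w


# Values for samples of n = 5, as quoted in Example 3.
W_BAR_5 = 2.3259
SIGMA_W_5 = 0.8641
UPPER_HALF_PER_CENT_5 = 4.85

l_p = standardized_limit(UPPER_HALF_PER_CENT_5, W_BAR_5, SIGMA_W_5)
w_p = range_limit(l_p, W_BAR_5, SIGMA_W_5)

print(round(l_p, 3), round(w_p, 2))
```

It is the standardized quantity l_p, which changes only slowly with n, that lends itself to interpolation between the framework sizes; the limit w_p itself is then rebuilt from the tabled mean and standard deviation.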


The complete set of values of w̄ was given by Tippett, but σ_w had only been
computed for n = 2, 3, 4, 5, 6, 10, 20, 60 and 100. Three additional values were
therefore computed at n = 30, 45 and 75 by the same process of cubature as that
employed by Tippett, with the following result:

Sample Size n                     |   30  |   45  |   75
Standard Deviation of Range σ_w   | .6927 | .6601 | .6237

* Loc. cit. p. 163.

† The Table on p. 162 of "Student's" paper referred to above gives the limits to the 1st decimal
place only.

‡ A graphical method of interpolation was used. A similar process was followed in finding the 1%
and 5% limits for √β₁ and β₂; see Biometrika, Vol. XXII. p. 247.
TABLE II.

Comparison of Observed and Theoretical Frequency Distributions.

n = 10
Observation | Theory
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| 


1000 | 1000-0 
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6 8-0 
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15 14°5 
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20 22°8 
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38 31°8 
32 36°2 
41 40-2 
419 43°7 
50 46°5 
46 48-6 
48 49-9 
49 50-4 
51 | 50:1 
48 | 49-0 
44 47°3 
56 45-1 
30 42-4 
44 39°3 
48 36:1 
28 32°7 
28 29°3 
25 26-0 
20 | 22°8 
15 | 19:9 
17 | 171 
13 | 146 
15 | 12:3 
13 | 10°3 
10 85 
3 7°0 
7 57 
7 4:6) 
6 3-7f 
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1 2°3> 
1 1-8) 
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R 4 1°] 
From this framework the intermediate values of σ_w shown in Table A were readily
obtained, with an error which it is believed should not be greater than a unit in
the 3rd decimal place. Finally the limits, w_p, were obtained from equation (2) and
are given in the Table*. How close these approximations are to the true values
we cannot at present tell, but the results described in the following section suggest
that they are not seriously in error.


(2) Experimental Checks on the Adequacy of the Approximation involved.

Tippett has given three experimental sampling distributions for n = 5, 10 and
20, and has fitted theoretical curves to the last two. But he placed little weight on
the values of β₁ and β₂ used†; since then improved values have been suggested
and it is on these that the present Table A is based. As I had available seven
further series of samples, use was made of these for a fresh comparison. The series
consisted of 1000 random samples of sizes n = 3, 4, 5, 7, 10, 15 and 20 drawn from
a normal population‡. The seven series were completely independent, but a further
series of 250 samples of 60 was obtained by combining together the samples of 15
in groups of 4. The observed and theoretical frequencies are compared in Table II,
for n = 3, 4, 10, 20 and 60. The result of applying the test for goodness of fit is
shown below in Table III. In calculating χ² small groups were combined at the
tails of the distributions so that none of the theoretical frequency groups contain
a frequency of less than 5. The brackets in Table II show the groupings used.


TABLE III.

Sample size |   3   |   4   |   10  |   20  |   60
χ²          | 43.00 | 34.99 | 26.28 | 28.10 | 15.93
n′          |   39  |   40  |   38  |   35  |   23
P           |  .266 |  .653 |  .905 |  .752 |  .819





For the cases n = 5,7 and 15, for which the theoretical curves of the framework 
were not available, only the expected and observed frequencies lying outside the 
eight percentage limits given in Table A are shown (Table IV). 

The least satisfactory agreement occurs when n = 3, where the curve appears to 
allow for too few cases at the two extremes, particularly at the lower limit. But 
apart from this there seems little evidence of any systematic differences, and taken 
as a whole the results encourage a reasoned confidence in the use of the percentage 
limits in Table A. 
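
The values of P in Table III can be checked without tables. The sketch below assumes that n′ is used in Elderton's sense, so that the number of degrees of freedom is n′ − 1, and evaluates the χ² integral from the usual power series for the incomplete Gamma function; the function name is illustrative.

```python
import math


def chi2_upper_tail(chi2, df):
    """P(chi-square with df degrees of freedom exceeds chi2), from the
    series for the regularized lower incomplete gamma function."""
    if chi2 <= 0.0:
        return 1.0
    a, y = df / 2.0, chi2 / 2.0
    term = 1.0 / a          # first term of the series
    total = term
    denom = a
    while term > 1e-16 * total:
        denom += 1.0
        term *= y / denom   # next term: y^k / (a (a+1) ... (a+k))
        total += term
    lower = total * math.exp(a * math.log(y) - y - math.lgamma(a))
    return 1.0 - lower


# (chi-square, n') for each sample size, as given in Table III.
TABLE_III = {3: (43.00, 39), 4: (34.99, 40), 10: (26.28, 38),
             20: (28.10, 35), 60: (15.93, 23)}

p_values = {n: chi2_upper_tail(chi2, n_prime - 1)
            for n, (chi2, n_prime) in TABLE_III.items()}

for n, p in p_values.items():
    print(n, round(p, 3))
```

With this reading of n′ the computed probabilities agree with the tabled P to about the third decimal place.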

* For the case n = 2, the distribution of w is the half of a normal curve whose standard deviation
(if complete) is √2. The limits were found from Sheppard's Tables.

† Loc. cit. p. 373. Reasons for modifying Tippett's values of β₁ and β₂ were discussed by the present
writer in the paper referred to above.

‡ The samples of 3 were provided by Dr J. F. Tocher and those of 20 by Professor T. Hojo; the
remainder were drawn for me by Mr A. E. Stone. I take this opportunity of thanking them all heartily
for their assistance. In all cases the sampling was carried out with the aid of Tippett's Random


Sampling Numbers (Tracts for Computers, No. xv). The group breadth was , the population standard 
deviation. 


TABLE IV.
Frequencies outside % Limits (n = 5, 7 and 15).

                  |        Lower Limits         |        Upper Limits
                  | 0.5% |  1%  |  5%  |  10%  |  10%  |  5%  |  1%  | 0.5%
Expected          |   5  |  10  |  50  |  100  |  100  |  50  |  10  |   5
Observed  n = 5   |   3  |  10  |  55  |   97  |   82  |  42  |  12  |   5
Observed  n = 7   |   6  |  12  |  45  |  102  |   88  |  53  |  14  |  10
Observed  n = 15  |   6  |  12  |  59  |  100  |  100  |  51  |  12  |   6





(3) Illustrations of the Use of Table A. 

In applying statistical analysis to test the probability of a given hypothesis, there 
will often be more than one method of procedure and more than one criterion 
which may be used. Thus in testing on a sample or samples some hypothesis con- 
cerning variation in the sampled population, we may use among possible criteria 
either the standard deviation or the range. In so far as we know the sampling 
distribution in both cases, either criterion is of equal value in controlling the risk 
of rejecting the hypothesis tested when it is true. But in general when dealing 
with normally distributed variables, tests on the standard deviation will be more 
efficient in preventing the acceptance of a hypothesis which is false, than those 
based upon the range. It must also be remembered that the theory assumes random 
sampling from a homogeneous population, and a single anomalous individual is 
more likely to throw out a result based upon range than one based upon standard 
deviation. In certain tests however the range criteria are at less disadvantage than 
in others, and because of their simplicity in application, if employed with judgment, 
they will often prove to be extremely useful tools. 


Example 1. In order to determine whether a given "lot" of a certain material
is up to specification, it may be necessary to consider not only the average value
of some character measured on each article, but also its standard deviation, σ. If
the lot be large, perhaps composed of several hundred articles, it is a common
commercial practice to estimate its nature by sampling. Suppose that it is wished
to fix a rejection limit for the variation permissible in the sample in such a way
that we are unlikely to reject a lot for which σ < a, and unlikely to accept it if
σ > ka, where a and k (> 1) are to be given some appropriate values depending on
the quality of the material which we are prepared to accept. Let us define "unlikely"
as corresponding to a 1 in 100 chance. What size of sample will then be necessary
to ensure this result if we use (a) the sample range, and (b) the sample standard
deviation, to provide the estimate of variation?*


* We shall suppose that evidence is available that the character is distributed approximately
normally. Further that the size of the sample is small compared to the size of the lot, so that the sample
may be considered as drawn from an "infinite population."




(a) Using Range. Let l(n, .01) be the lower and l(n, .99) the upper 1% limit
obtained from Table A. Our rule will be to reject the lot when the sample range,
w, is > w₀. To determine w₀ we must find n so that

$$ l(n, .99) \times a = w_0 = l(n, .01) \times ka \qquad (3). $$

The first equality will result in the long run in our rejecting a lot for which
σ < a at most 1 time in 100; the second, in our accepting a lot for which σ > ka at
most 1 time in 100. Suppose now that k = 2. An examination of the Table shows
that for n = 40, l(n, .99)/l(n, .01) = 6.09/2.97 = 2.05; and for n = 45, the ratio is
6.16/3.09 = 1.99. The desired size of sample is therefore about 44, and w₀ = 6.15a.
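
The choice of n from condition (3) can be sketched as a simple interpolation between the two neighbouring rows of Table A quoted above. Linear interpolation in n is an assumption made here for illustration; the text does not say how the figure of "about 44" was reached.

```python
# 1% limits of range from Table A, as quoted in the text, in units of sigma:
# sample size -> (lower 1% limit, upper 1% limit)
TABLE_A_ONE_PER_CENT = {40: (2.97, 6.09), 45: (3.09, 6.16)}

K = 2.0  # ratio of the two standard deviations to be discriminated


def limit_ratio(n):
    """Ratio of the upper to the lower 1% limit for samples of n."""
    lower, upper = TABLE_A_ONE_PER_CENT[n]
    return upper / lower


# Fraction of the way from n = 40 to n = 45 at which the ratio equals k.
ratio_40, ratio_45 = limit_ratio(40), limit_ratio(45)
fraction = (ratio_40 - K) / (ratio_40 - ratio_45)

n_required = 40 + 5 * fraction
upper_40, upper_45 = TABLE_A_ONE_PER_CENT[40][1], TABLE_A_ONE_PER_CENT[45][1]
w0_over_a = upper_40 + fraction * (upper_45 - upper_40)

print(round(n_required, 1), round(w0_over_a, 2))
```

The interpolated size lies between 44 and 45 and the rejection limit comes out at about 6.15a, in agreement with the figures given.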


(b) Using Standard Deviation. Let s be the sample standard deviation, and
s₀ the limiting value. The upper and lower 1% limits for s may be found from the
tables of the χ² integral, where χ² = ns²/σ², and in Elderton's notation n′ = n*.
Using a similar notation to that of the range problem above, we must make s₀
satisfy the relation

$$ \chi^2(n, .99) \times \frac{a^2}{n} = s_0^2 = \chi^2(n, .01) \times \frac{k^2 a^2}{n} \qquad (4). $$

Again taking k = 2 for purposes of illustration, it will be found that for n = 24,
χ²(n, .99)/χ²(n, .01) = 4.08 and for n = 25 the ratio is 3.96. Consequently the
condition will be satisfied for a sample of 25, and s₀² = 1.72a². Method (b) has therefore
a clear advantage over method (a), for the gain in time in computation following
the use of range would hardly balance the loss involved in measuring 20 additional
articles.
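
The search for n under condition (4) may be sketched as follows, taking the number of degrees of freedom as n − 1 and finding the 1% points of χ² by bisection on the series for the incomplete Gamma function. The function names are illustrative, and the stopping rule (the first n for which the ratio of the two 1% points does not exceed k²) is an assumption consistent with the figures quoted.

```python
import math


def chi2_cdf(x, df):
    """P(chi-square with df degrees of freedom is below x), by series."""
    if x <= 0.0:
        return 0.0
    a, y = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    denom = a
    while term > 1e-16 * total:
        denom += 1.0
        term *= y / denom
        total += term
    return total * math.exp(a * math.log(y) - y - math.lgamma(a))


def chi2_point(prob_below, df):
    """Value of chi-square below which a proportion prob_below lies."""
    lo, hi = 0.0, df + 10.0 * math.sqrt(2.0 * df) + 20.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < prob_below:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def smallest_sample_size(k, tail=0.01, n_max=200):
    """First n whose upper/lower chi-square limits have ratio <= k^2.
    Returns n and the rejection limit s0^2 in units of a^2."""
    for n in range(2, n_max + 1):
        upper = chi2_point(1.0 - tail, n - 1)
        lower = chi2_point(tail, n - 1)
        if upper / lower <= k * k:
            return n, upper / n
    raise ValueError("no sample size up to n_max satisfies the condition")


n_required, s0_sq_over_a_sq = smallest_sample_size(2.0)
print(n_required, round(s0_sq_over_a_sq, 2))
```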


Example 2. The use of range is however of greater value if a sample is divided
into small groups. Suppose that a sample of N is broken up in a random manner
into m sub-samples each containing n observations, so that N = mn. Let w₁, w₂, ... w_m
be the observed ranges of these sub-samples, and

$$ \bar{w} = (w_1 + w_2 + \ldots + w_m)/m \qquad (5) $$

be their mean value. We know that the expected values of the mean range and
standard deviation of range in repeated samples of n are

$$ W = a_n\,\sigma, \qquad \sigma_w = b_n\,\sigma \qquad (6), $$

where a_n and b_n are given in the second and third columns of Table A. It follows
that we may use as an estimate of the population standard deviation, σ,

$$ \sigma_2 = \bar{w}/a_n \qquad (7), $$

and that this estimate will have a standard error (S.E.)

$$ \text{S.E. of } \sigma_2 = \frac{1}{a_n}\,(\text{S.E. of } \bar{w}) = \frac{b_n}{a_n}\,\frac{\sigma}{\sqrt{m}} \qquad (8). $$

* The 1% limits may be obtained directly without interpolation from the χ² tables in R. A. Fisher's
Statistical Methods for Research Workers; the n of these tables is the number of degrees of freedom = sample
size − 1.
Let us compare the reliability of this estimate with that obtained from the
standard deviation, s, of the whole N = mn observations*. In this case we know
that if s̄ is the mean and σ_s the standard error of s in repeated samples, then

$$ \bar{s} = c_N\,\sigma, \qquad \sigma_s = d_N\,\sigma \qquad (9), $$

where the values of c_N and d_N for small samples have been tabled for the case of a
normal population†, and tend rapidly to 1 and 1/√(2N), respectively, as N increases.
Thus we may use as an estimate of σ

$$ \sigma_1 = s/c_N \qquad (10), $$

which will have as its standard error

$$ \text{S.E. of } \sigma_1 = \frac{d_N}{c_N}\,\sigma = \theta_N\,\frac{\sigma}{\sqrt{2N}} \qquad (11), $$

where θ_N → 1 as N → ∞.

A comparison of the reliability of these two methods of estimating σ is obtained
from the ratio

$$ \frac{\text{Standard Error of } \sigma_2}{\text{Standard Error of } \sigma_1} = \frac{1}{\theta_N}\,\frac{b_n \sqrt{2n}}{a_n} = \frac{\phi_n}{\theta_N} \qquad (12). $$

The following are a few values of θ_N:

N    |   5   |   10  |   20  |   30  |   40  |   50  |  100
θ_N  | 1.148 | 1.068 | 1.033 | 1.021 | 1.016 | 1.013 | 1.006

By taking θ_N = 1 we shall slightly underestimate the reliability of σ₂ (compared
to σ₁), but the correction can be made if desired. A series of values of φ_n = √(2n) b_n/a_n
is given in Table V. It is seen that the range method of estimation may be used
to best advantage by breaking up the observations into equal random sub-samples
of from 6 to 10 individuals. If this is done, the standard error of σ₂ = Mean range/a_n
will be approximately 1.15 times the standard error of σ₁ = s‡.

TABLE V.

 n  |  φ_n  |  n  |  φ_n  |  n  |  φ_n
  2 | 1.511 |  12 | 1.169 |  30 | 1.314
  3 | 1.286 |  13 | 1.176 |  35 | 1.352
  4 | 1.209 |  14 | 1.183 |  40 | 1.387
  5 | 1.175 |  15 | 1.191 |  45 | 1.418
  6 | 1.159 |  16 | 1.200 |  50 | 1.449
  7 | 1.153 |  17 | 1.208 |  60 | 1.509
  8 | 1.152 |  18 | 1.216 |  70 | 1.563
  9 | 1.154 |  19 | 1.225 |  80 | 1.613
 10 | 1.158 |  20 | 1.234 |  90 | 1.662
 11 | 1.163 |  25 | 1.275 | 100 | 1.706


* Nothing would be gained when using s by dividing the N observations into groups.
† Biometrika, Vol. X. p. 529. Tables for Statisticians and Biometricians, Part II, Table XVI.
‡ This is omitting the correcting factors c_N and θ_N.
The table may also be used to compare the reliabilities of different forms of
estimate of σ made from range. For example, N observations may be used

(a) as m groups of n,

$$ \text{Estimate, } \sigma_2' = \frac{\bar{w}}{a_n}; \qquad \text{S.E.} = \frac{b_n}{a_n}\,\frac{\sigma}{\sqrt{m}} = \phi_n\,\frac{\sigma}{\sqrt{2N}} \qquad (13), $$

(b) as one group, so that m = 1,

$$ \text{Estimate, } \sigma_2'' = \frac{w}{a_N}; \qquad \text{S.E.} = \frac{b_N}{a_N}\,\sigma = \phi_N\,\frac{\sigma}{\sqrt{2N}} \qquad (14). $$

The ratio of the two standard errors of estimate is therefore φ_n/φ_N. For example,
the advantage of breaking a sample of 100 into 10 samples of 10 is clear, for
φ₁₀/φ₁₀₀ = 1.158/1.706 = 0.679, or by using σ₂′ rather than σ₂″ we obtain a reduction
in standard error of about 32%.
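
The entries of Table V follow directly from the second and third columns of Table A, as the sketch below illustrates for the two sample sizes whose constants are quoted in the text (n = 5 in Example 3 and n = 6 in Example 4). The function name is illustrative.

```python
import math


def phi(n, a_n, b_n):
    """phi_n = sqrt(2n) * b_n / a_n, where a_n and b_n are the mean and
    standard deviation of range in units of sigma."""
    return math.sqrt(2.0 * n) * b_n / a_n


# a_n and b_n as quoted in Examples 3 and 4.
phi_5 = phi(5, 2.3259, 0.8641)
phi_6 = phi(6, 2.53441, 0.8480)

# Ten groups of 10 against a single group of 100, using Table V.
ratio = 1.158 / 1.706
reduction_per_cent = 100.0 * (1.0 - ratio)

print(round(phi_5, 3), round(phi_6, 3))
print(round(ratio, 3), round(reduction_per_cent))
```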

There is also another advantage. The sampling distribution of range is asymmetrical
but approaches most nearly to the normal when n lies between 6 and 10*;
if β₁ and β₂ refer to the sampling distribution of w, then the coefficients for the
sampling distribution of w̄, the mean range in m samples (and therefore the coefficients
for σ₂ = w̄/a_n), will be B₁ = β₁/m and B₂ = 3 + (β₂ − 3)/m. Consequently the estimate
σ₂ will not only have the lowest standard error when n has a value of 6 to 10, but
will also be more nearly normally distributed than if n were larger.

Example 3. Reliability in estimation is closely associated with efficiency in
discrimination. The following figures represent 40 random variates obtained from
sampling a normal distribution with mean = 51 and standard deviation = 10 units.
Could we be sure that they were not drawn from a distribution in which σ = 5 units?

48, 54, 41, 53, 49; 51, 44, 34, 62, 54; 59, 39, 45, 57, 49;
44, 50, 57, 37, 50; 62, 57, 51, 59, 54; 35, 49, 36, 63, 46;
53, 41, 47, 39, 59; 54, 44, 61, 63, 44.

If the sample is treated as a whole, the lowest and highest variates are found to
be 34 and 63, giving a range of 29. For a sample of n = 40 from a distribution with
σ = 5, Table A shows that the upper 1% limit of range is 6.09 × 5 = 30.45. The
observed range is slightly less than this, so that the sample would be judged
exceptional but not clearly impossible if σ were equal to 5. Suppose now the numbers
are broken up into 8 consecutive groups of 5 as shown by the semi-colons; the 8
ranges are now 13, 28, 20, 20, 11, 28, 20, 19. The upper 0.5% limit for samples of
5 from a distribution with σ = 5 is now 4.85 × 5 = 24.25, and 2 out of the 8 ranges
exceed this value. This alone suggests that the hypothesis, σ = 5, is unlikely, but
we may obtain more convincing evidence by comparing the mean range for the
8 samples, w̄ = 19.87, with the expected mean and its standard error for σ = 5.

We have
$$ W = a_n \times \sigma = 2.3259 \times 5 = 11.63, $$
$$ \sigma_{\bar{w}} = \sigma_w/\sqrt{m} = b_n\,\sigma/\sqrt{m} = .8641 \times 5/\sqrt{8} = 1.53. $$
Since (w̄ − W)/σ_w̄ = 5.4, there can now be little doubt whatever that the standard
deviation in the sampled population must have been greater than 5 units.
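
The arithmetic of Example 3 can be retraced step by step. The sketch below uses only the 40 variates and the Table A constants quoted above; the variable names are illustrative.

```python
import math

VALUES = [48, 54, 41, 53, 49, 51, 44, 34, 62, 54, 59, 39, 45, 57, 49,
          44, 50, 57, 37, 50, 62, 57, 51, 59, 54, 35, 49, 36, 63, 46,
          53, 41, 47, 39, 59, 54, 44, 61, 63, 44]

SIGMA_HYPOTHESIS = 5.0
A_5, B_5 = 2.3259, 0.8641          # mean and s.d. of range for n = 5
UPPER_ONE_PER_CENT_40 = 6.09       # upper 1% limit of range, n = 40
UPPER_HALF_PER_CENT_5 = 4.85       # upper 0.5% limit of range, n = 5

# The sample treated as a whole.
overall_range = max(VALUES) - min(VALUES)
overall_limit = UPPER_ONE_PER_CENT_40 * SIGMA_HYPOTHESIS

# The sample broken into 8 consecutive groups of 5.
groups = [VALUES[i:i + 5] for i in range(0, len(VALUES), 5)]
ranges = [max(g) - min(g) for g in groups]
group_limit = UPPER_HALF_PER_CENT_5 * SIGMA_HYPOTHESIS
exceeding = sum(1 for w in ranges if w > group_limit)

# Mean range compared with its expectation and standard error.
mean_range = sum(ranges) / len(ranges)
expected_mean = A_5 * SIGMA_HYPOTHESIS
standard_error = B_5 * SIGMA_HYPOTHESIS / math.sqrt(len(ranges))
deviate = (mean_range - expected_mean) / standard_error

print(overall_range, overall_limit)
print(ranges, exceeding)
print(round(mean_range, 2), round(expected_mean, 2),
      round(standard_error, 2), round(deviate, 1))
```

The single range of 29 falls just inside its 1% limit, whereas the mean of the eight group ranges lies more than five standard errors above expectation, which is the gain in discrimination the example is meant to show.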


* Biometrika, Vol. XVIII. p. 191. Tables for Statisticians and Biometricians, Part II, p. cxvii.
Example 4. The figures given in Table VI represent the breaking-strength
under tension in lbs. of small briquettes of cement-mortar*. Mixings of the material
were carried out on each of 10 different days, and 6 briquettes formed from each
mixing. The problem is to determine whether these 10 groups (or samples) of
6 differ significantly either in mean strength or in variability. Other investigations
have suggested that the variation in strength within a homogeneous sample is near
enough to normal to justify the use of normal theory tests.


TABLE VI.
Breaking-Strength of Cement-Mortar Briquettes.

Group Number        |    1   |    2   |    3   |    4   |    5   |    6   |    7   | ...
Values of breaking-strength in lbs. (individual values not legible)
Mean                |  383.3 |  508.0 |  505.0 |  582.7 |  557.7 |  337.0 |  514.3 | ...
Variance            |  930.6 | 1733.3 | 1558.3 | 1032.6 |  748.6 |  588.0 | 2372.6 | ...
Standard Deviation  |   30.5 |   41.6 |   39.5 |   32.1 |   27.4 |   24.2 |   48.7 | ...
Range               |   95   |  128   |  100   |   91   |   68   |   65   |  148   | ...


In the first place there are two hypotheses to be tested: H₁ that the group
variances, and H₂ that the group means do not differ significantly. Let us apply to
the problem in turn statistical tests of increasing refinement.
Denote by x̄_t, s_t and w_t the mean, standard deviation and range of the tth group;
by k the number of groups (= 10); by N the total number of observations (= 60);
by n_t the number in the tth group (= 6); and by x̄₀ and s₀ the mean and standard
deviation of the N observations.


(a) Assume hypothesis H₁ to be true and test H₂. Crude Method using Range.
The mean of the 10 values of w is found to be w̄ = 83.7 lbs. But for repeated samples
of 6 we see from Table A that W = 2.53441 × σ; consequently we may obtain a
rough estimate of the assumed common group standard deviation, namely

$$ \sigma_2 = 83.7/2.53441 = 33.0 \text{ lbs.} \qquad (15). $$

The lowest of the ten group-means is the 6th (337.0 lbs.), and the highest is the
10th (753.0 lbs.); this gives a range of 416.0 lbs. If the means differed only through
chance fluctuations, they would vary with a standard error of σ/√6, which using the
estimate σ₂ = 33.0 lbs. becomes 13.5 lbs. The observed range among the ten means

* I am indebted to Mr B. H. Wilson of the Building Research Station for permission to use these
data. The cement-mortar used on different days was obtained from different sources so that a difference
in mean strength was to be expected.
is 416.0/13.5 = 30.8 times this standard error. It is almost inconceivable that such
a ratio has occurred through chance, and we may therefore conclude that hypothesis
H₂ cannot be true, or that the mean strength differs very significantly from group
to group*.
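
The steps of the crude range method, from the mean range to the ratio just quoted, may be laid out as follows. Only figures given in the text are used; the upper 0.5% limit of 5.40 for the range of ten values is the one cited in the accompanying footnote, and the variable names are illustrative.

```python
import math

MEAN_RANGE = 83.7          # mean of the ten group ranges, lbs.
A_6 = 2.53441              # mean range for samples of 6, in units of sigma
GROUP_SIZE = 6
LOWEST_MEAN, HIGHEST_MEAN = 337.0, 753.0
UPPER_HALF_PER_CENT_10 = 5.40   # upper 0.5% limit of range for n = 10

# Formula (15): rough estimate of the common group standard deviation.
sigma_2 = MEAN_RANGE / A_6

# Standard error of a group mean, and the range of the ten means in that unit.
se_of_mean = sigma_2 / math.sqrt(GROUP_SIZE)
range_of_means = HIGHEST_MEAN - LOWEST_MEAN
ratio = range_of_means / se_of_mean

print(round(sigma_2, 1), round(se_of_mean, 1), round(ratio, 1))
print(ratio > UPPER_HALF_PER_CENT_10)
```

Carried through without intermediate rounding the ratio comes out at about 30.9 rather than 30.8; the difference arises only from rounding the standard error to 13.5 lbs. and does not affect the conclusion.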

It should be noted that when lack of homogeneity has been established either
by the range test or otherwise, we may next question whether this is due to the
presence in the series of one or more anomalous individual (or group) values. For
this purpose Irwin's tables of the probability integral of the distances between the
1st and 2nd, and between the 2nd and 3rd, individuals in samples from a normal
population will be of assistance†.


(b) Assume H₁ true, and test H₂. Exact method. In this case we avoid the use
of range and obtain an estimate of σ² from the group variances s_t², namely

$$ \sigma_1^2 = \sum_{t=1}^{k} (n_t s_t^2)/(N - k) = 1243.43, \quad \text{or } \sigma_1 = 35.26 \text{ lbs.} \qquad (16). $$

Actually we may avoid the labour of calculating the separate values of s_t² by
using the identity

$$ \sum_{t=1}^{k} (n_t s_t^2) = N s_0^2 - \sum_{t=1}^{k} n_t (\bar{x}_t - \bar{x}_0)^2 \qquad (17). $$
It is found that

$$ \eta^2 = \frac{1}{N s_0^2} \sum_{t=1}^{k} n_t (\bar{x}_t - \bar{x}_0)^2 = .885 \qquad (18) $$

and is clearly significant, again showing that hypothesis H₂ is untenable‡.


(c) To test H₁. Method using Range. We now wish to determine whether
the variation of breaking-strength within each group of 6 changes from group
to group more than might be expected through chance. Using the estimate,
σ₂ = 33.0 lbs., based upon the mean range we may ask whether the ten group-ranges
given at the bottom of Table VI differ significantly. The position can be studied
in the upper diagram of Fig. 1, which shows the ten values of range represented
by black circles, and also the lower and upper percentage limits obtained by
multiplying by 33.0 the figures taken from the row n = 6 of Table A. The small
figures above the circles refer to the corresponding group numbers of Table VI. It
will be seen that 3 ranges out of 10 lie beyond the two 5% limits; the expectation
is 1 out of 10.
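
As a rough guide to how unusual 3 out of 10 is when 1 is expected, a binomial calculation may be sketched. This is an illustration added here and not part of the original analysis: it assumes that the ten ranges are independent and that each falls outside the two 5% limits with probability 0.1, which would be strictly true only if σ were known rather than estimated from the same samples.

```python
from math import comb

GROUPS = 10
P_OUTSIDE = 0.10      # chance of a range falling beyond either 5% limit
OBSERVED = 3


def binomial_tail(n, p, at_least):
    """Probability of at_least or more successes in n independent trials."""
    return sum(comb(n, j) * p ** j * (1.0 - p) ** (n - j)
               for j in range(at_least, n + 1))


expected_outside = GROUPS * P_OUTSIDE
tail_probability = binomial_tail(GROUPS, P_OUTSIDE, OBSERVED)

print(expected_outside, round(tail_probability, 3))
```

On these assumptions the chance of three or more such ranges is about 7 in 100, suggestive rather than decisive.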


The standard deviation of range for samples of 6 is .8480 × σ; using σ₂ as the
estimate of σ, we find

$$ \sigma_w = 33.0 \times .8480 = 28.0 \text{ lbs.} \qquad (19). $$


* This result is of course obvious from a mere inspection of the figures, but the example illustrates
the method of attack. Table A shows us that the range among ten means should only exceed 5.40 × standard
error through chance on 5 occasions in 1000; the experiment gives a factor of 30.8!

† Biometrika, Vol. XVII. pp. 238–250. Tables for Statisticians and Biometricians, Part II, pp. cv–cx
and Tables XIX and XX.

‡ For N = 60, k = 10 we should expect in samples, were H₂ true, η̄² = .1525, σ_η² = .0651. (See Woo's
Tables, Tables for Statisticians and Biometricians, Part II, p. 17.)
The observed standard deviation of the 10 ranges is larger than this, namely
35.7 lbs.

These results suggest that a more critical analysis is desirable as the variation 
in group range may perhaps be significant. 


(d) Improvement on (c). If test (b) has been used in examining the means,
the estimate σ₁ = 35.26 lbs. is preferable to σ₂ = 33.0 lbs., and should be used in (c).
In the present instance, however, the change will scarcely affect the position.


(e) To test $H_1$. Method using Variance. If it is decided to calculate the
individual group variances, $s_t^2$, we may form a diagram showing the position of the
10 values with regard to the percentage limits in exactly the same manner as for
the range. This is shown in the lower half of Fig. 1. For samples of $n$ from a
normal distribution, $s^2$ is distributed according to the law












































$$y = y_0\,(s^2)^{\frac{n-3}{2}}\, e^{-\frac{n s^2}{2\sigma^2}} \qquad (20).$$

[Fig. 1. Upper diagram: the ten group ranges and their percentage limits. Lower diagram: the ten group variances and their percentage limits; scale of variance in (lbs.)$^2$.]


The limits may be found either from the Tables of the Incomplete Gamma
Function, or from those of $\chi^2$ by writing $s^2 = \chi^2\sigma^2/n$. A comparison of the two diagrams
shows the small difference in the relative position of the sample points with regard to
their scales. Range and variance are in fact highly correlated in small samples,
and in the present case the analysis of range, which is far the more rapidly carried
out, is probably as useful for the purpose as the analysis of variance.


Both tests as described provide a picture which is of value in forming a judgment
on the situation, but neither can lead to an exact measure of probability. For in
each case we must substitute for the unknown $\sigma$ an estimate (whether $\sigma_1$ or $\sigma_2$)
depending on the individual samples. This disadvantage is overcome in the following
test (f).


(f) To test $H_1$. Use of the Criterion of Likelihood. In a paper recently
published elsewhere*, Dr J. Neyman and the writer have discussed a test developed
from the principle of likelihood. It will perhaps be of interest to conclude by stating
briefly the result which it leads to in the present problem. The criterion suggested
may be defined as
$$L = \lambda_{H_1}^{2/N} = \prod_{t=1}^{k} (s_t^2)^{n_t/N} \Big/ s_a^2 \qquad (21),$$
where
$$s_a^2 = \frac{1}{N}\sum_{t=1}^{k} n_t s_t^2 \qquad (22).$$


$L$ is therefore the ratio of the weighted geometric mean to the weighted
arithmetic mean of the group variances, and is independent of the unknown $\sigma$.
When the variation is normal, the moments of the sampling distribution of $L$ (if
hypothesis $H_1$ be true) have been found. In the simple case in which the groups
contain the same number of individuals, i.e. when $n_1 = n_2 = \ldots = n_k = N/k = n$, say,
the $p$th moment coefficient of $L$ about zero is
$$\mu_p' = k^p \left\{\frac{\Gamma\!\left(\frac{n-1}{2} + \frac{p}{k}\right)}{\Gamma\!\left(\frac{n-1}{2}\right)}\right\}^{k} \frac{\Gamma\!\left(\frac{N-k}{2}\right)}{\Gamma\!\left(\frac{N-k}{2} + p\right)} \qquad (23).$$
Reasons are given in the paper referred to for believing that the distribution
of $L$ may be represented approximately in many cases by a Type I curve of form
$$y = y_0\, L^{m_1 - 1} (1 - L)^{m_2 - 1} \qquad (24)$$
with the correct mean and standard deviation. That is to say, $m_1$ and $m_2$ are to be
determined from the first two moment coefficients in (23) as follows
$$m_1 = \frac{\mu_1'(\mu_1' - \mu_2')}{\mu_2' - \mu_1'^2}, \qquad m_2 = \frac{(1 - \mu_1')(\mu_1' - \mu_2')}{\mu_2' - \mu_1'^2} \qquad (25).$$
The hypothesis $H_1$ becomes less and less likely as $L \to 0$, and the chance of
obtaining a value of $L$ less than any given value may be found by any method giving
the values of the Incomplete B-Function.
For the present example, in which $N = 60$, $k = 10$, $n = 6$, it is found from (21)
that
$$L = .702 \qquad (26).$$
Further $m_1 = 21.82$ and $m_2 = 4.54$.
We may now apply R. A. Fisher's z-transformation to (24), and write
$$L = \frac{n_2}{n_2 + n_1 e^{2z}} \qquad (27),$$
$$n_1 = 2m_2 = 9.08, \qquad n_2 = 2m_1 = 43.64 \qquad (28).$$
* “On the Problem of k Samples.” Bulletin de l'Académie Polonaise des Sciences et des Lettres.
Série A. Sciences Mathématiques, 1931, pp. 460–481.
TABLE A.

Percentage Limits for the Distribution of Range in Samples From a Normal Population.

[Entries marked … and the whole Standard Deviation column are illegible in this copy.]

Size of |         |  Lower Limits  |  Upper Limits
Sample  |  Mean   |   1%   |   5%  |   5%   |   1%
   2    | 1.12838 |   .02  |   .09 |  2.77  |  3.64
   3    | 1.69257 |   .22  |   .45 |  3.34  |  4.10
   4    | 2.05875 |   .47  |   .77 |  3.65  |  4.38
   5    | 2.32593 |   .70  |  1.04 |  3.87  |  4.59
   6    | 2.53441 |   .89  |  1.26 |  4.04  |  4.74
   7    | 2.70436 |  1.07  |  1.44 |  4.18  |  4.87
   8    | 2.84720 |  1.22  |  1.60 |  4.29  |  4.98
   9    | 2.97003 |  1.36  |  1.74 |  4.39  |  5.07
  10    | 3.07751 |  1.48  |  1.86 |  4.48  |  5.15
  11    | 3.17287 |  1.59  |  1.97 |  4.55  |  5.22
  12    | 3.25846 |  1.69  |  2.07 |  4.62  |  5.28
  13    | 3.33598 |   …    |   …   |  4.68  |  5.34
  14    | 3.40676 |   …    |   …   |  4.74  |  5.39
  15    | 3.47183 |   …    |   …   |  4.79  |  5.44
  16    | 3.53198 |   …    |   …   |  4.84  |  5.49
  17    | 3.58788 |   …    |   …   |  4.89  |  5.53
  18    | 3.64006 |   …    |   …   |  4.93  |  5.57
  19    | 3.68896 |   …    |   …   |  4.97  |  5.61
  20    | 3.73495 |   …    |   …   |  5.01  |  5.64
  21    | 3.77834 |  2.31  |  2.67 |  5.05  |  5.68
  22    | 3.81938 |  2.36  |  2.72 |  5.08  |  5.71
  23    | 3.85832 |  2.40  |  2.77 |  5.11  |  5.74
  24    | 3.89535 |  2.45  |  2.81 |  5.14  |  5.76
  25    | 3.93063 |  2.49  |  2.85 |  5.17  |  5.79
  26    | 3.96432 |  2.53  |  2.89 |  5.20  |  5.82
  27    | 3.99654 |  2.57  |  2.93 |  5.23  |  5.84
  28    | 4.02741 |  2.61  |  2.96 |  5.25  |  5.87
  29    | 4.05704 |  2.65  |  3.00 |  5.28  |  5.89
  30    | 4.08552 |  2.69  |  3.04 |  5.30  |  5.91
  35    | 4.21322 |  2.84  |  3.18 |  5.41  |  6.01
  40    | 4.32156 |  2.97  |  3.31 |  5.50  |  6.09
  45    | 4.41544 |  3.09  |  3.42 |  5.57  |  6.16
  50    | 4.49815 |  3.19  |  3.51 |   …    |   …
  55    | 4.57197 |  3.28  |  3.59 |   …    |   …
  60    | 4.63856 |  3.36  |  3.67 |   …    |   …
  65    | 4.69916 |  3.43  |  3.74 |   …    |   …
  70    | 4.75472 |  3.50  |  3.81 |   …    |   …
  75    | 4.80598 |  3.56  |  3.87 |   …    |   …
  80    | 4.85355 |  3.62  |  3.92 |   …    |   …
  85    | 4.89789 |  3.67  |  3.97 |  5.98  |  6.54
  90    | 4.93940 |  3.72  |  4.02 |  6.02  |   …
  95    | 4.97841 |  3.77  |  4.06 |  6.05  |  6.60
 100    | 5.01519 |  3.81  |  4.11 |  6.08  |   …
The 5% and 1% limits for $z$ may then be found by interpolating in his
tables* with the following results:

5% point $z = .371$, $L = .696$

1% point $z = .519$, $L = .630$

The observed value, $L = .702$, lies close to the 5% point; or differences in
group variation as large or larger than those observed might be expected to arise
through chance in about 1 experiment in 20. We can hardly therefore feel
confident that they are significant, and find no reason to be dissatisfied by the
cruder picture provided by the range test.
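The arithmetic of test (f) can be retraced with a short Python sketch. It is offered only as an illustration under stated assumptions: equal group sizes, the moment formula for $L$ in terms of Gamma functions, a Type I (Beta) curve fitted by its first two moments, and Fisher's $z$ taken as .371 and .519 at the 5% and 1% points as quoted from his tables. The function names are hypothetical.

```python
import math


def l_moment(p, n, k):
    """p-th moment about zero of L for k groups of equal size n (equation 23)."""
    big_n = n * k
    a = (n - 1) / 2.0
    b = (big_n - k) / 2.0
    log_moment = (
        p * math.log(k)
        + k * (math.lgamma(a + p / k) - math.lgamma(a))
        + math.lgamma(b)
        - math.lgamma(b + p)
    )
    return math.exp(log_moment)


def type_one_exponents(mu1, mu2):
    """Exponents m1, m2 of the fitted Type I curve (equations 24 and 25)."""
    common = (mu1 - mu2) / (mu2 - mu1 ** 2)
    return mu1 * common, (1.0 - mu1) * common


def l_from_z(z, n1, n2):
    """Value of L corresponding to Fisher's z (equation 27)."""
    return n2 / (n2 + n1 * math.exp(2.0 * z))


# The example of the text: N = 60, k = 10, n = 6.
mu1 = l_moment(1, n=6, k=10)
mu2 = l_moment(2, n=6, k=10)
m1, m2 = type_one_exponents(mu1, mu2)
n1, n2 = 2.0 * m2, 2.0 * m1

l_five_percent = l_from_z(0.371, n1, n2)
l_one_percent = l_from_z(0.519, n1, n2)

print(f"m1 = {m1:.2f}, m2 = {m2:.2f}")
print(f"n1 = {n1:.2f}, n2 = {n2:.2f}")
print(f"5% point of L = {l_five_percent:.3f}, 1% point of L = {l_one_percent:.3f}")
```

Working with `math.lgamma` rather than `math.gamma` keeps the ratio of Gamma functions stable when $N - k$ is large.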


In conclusion I would like to acknowledge much friendly assistance received 
in preparing this paper. Table A is built essentially on the earlier work of 
Mr L. H. C. Tippett, who has also lent me certain unpublished material which was 
of much value in its construction. “Student” too has given me many helpful 
suggestions, and the work was undertaken in the first instance with the knowledge 
that both Tippett and he had proved the utility of the range criterion in certain fields 
of practical application. 


In addition I am most grateful to Mr M. R. El-Shanawany for a large part of
the lengthy computation of the framework distributions for Table A; to Dr L. J.
Comrie for supervising the computation of $\sigma_w$ for the cases $n = 45$ and 75; and to
“Mathetes” for the loan of the working sheets on which the theoretical distributions
contained in “Student’s” 1927 paper on Routine Analysis were based.


* Statistical Methods for Research Workers, pp. 212—215. 








THE CONVERSE OF SPEARMAN’S TWO-FACTOR THEOREM. 


By BURTON H. CAMP, Wesleyan University, Middletown, Conn., U.S.A. 


1. Introduction. There have been several attempts to prove, or to disprove, 
the converse of Spearman’s two-factor theorem*. As shown by Irwin, these various 
methods of proof result ultimately in the same expression for the so-called general 
factor. Although the several proofs are necessarily alike in many respects, the 
different authors appear to have different pictures in mind at the background of 
their analytical demonstrations. The same may be said of the proof presented in 
this paper. In addition, I have inserted a certain necessary but hitherto neglected†
hypothesis, have investigated the possibility of the use of other than linear functions 
as the basis of the formation of the general factor function, have given a numerical 
example in which this factor is not unique, and have discussed more fully the 
important additional question raised by Piaggio as to whether this factor is 
“almost” unique. Let us begin with the neglected hypothesis. 


2. The Net Correlations. It will first be shown that Spearman’s proof does 
not hold in a specific case. The reason it fails is that he has produced a certain 
determinant as a definition of his general factor without proving that there exists 
a frequency distribution for which such a definition would be possible. In presenting 
this example I shall use his notation, to which the reader may wish to refer, unless 
he chooses to read the later sections of this paper first. My notation, and Spearman’s, 
will be explained in the sections following. 


Let $r_{ab}$, $r_{ac}$, etc. be as in the following table:

      |   a   |   b   |   c   |   d
  a   |   -   |  .8   |  .5   |  .35
  b   |  .8   |   -   |  .7   |  .49
  c   |  .5   |  .7   |   -   |  .306
  d   |  .35  |  .49  |  .306 |   -

Evidently, this set forms a perfect hierarchy, thus satisfying Spearman’s only
hypothesis. He now defines $r_{\alpha g}$ so that
$$r_{\alpha g} = \sqrt{\lambda_{\alpha q}\, r_{\alpha q}}, \qquad \lambda_{\alpha q} = r_{\alpha k}/r_{kq},$$

* C. Spearman, The Abilities of Man (1927), Macmillan Company, Appendix, pp. ii–vii. E. B.
Wilson, Proceedings of the National Academy of Sciences (U.S.A.), Vol. xiv. (1928), pp. 283–291.
H. T. H. Piaggio, Nature, January 10, 1931, p. 56. J. O. Irwin, British Journal of Psychology, Vol. xx.
(1932), pp. 359–363. Cf. also J. C. M. Garnett, same volume, pp. 364–372.

† Except by Wilson, whose methods are quite different.
where $q$, $k$ and $\alpha$ are chosen arbitrarily from the letters $a, b, c, \ldots$, except that the
same letter must not be chosen twice. In particular, now, let $\alpha = b$, $k = c$, $q = a$.
Then
$$r_{bg}^2 = \frac{r_{bc}\, r_{ba}}{r_{ca}} = \frac{.7 \times .8}{.5} > 1,$$
which is impossible. The tacit assumption is made, therefore, that
$$r_{ac} - r_{bc}\, r_{ba} > 0,$$
and we are led to infer that this inequality should be predicated for all permitted
choices of the subscripts, or, to put it another way, that none of the net correlations
$r_{qk.\alpha}$ should be permitted to be negative. In fact, this condition is really
necessary as will be proved later. On the other hand, I presume that it is true that
it would almost certainly be satisfied automatically in the kind of practical problems
to which the theory is currently applied, and so it may be that it is only from the
logician’s standpoint that it is necessary to insist on it.
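The counter-example of this section can be checked numerically. The sketch below is an illustration only: the six correlations are an assumed reading of the table above (the scan is partly illegible, and the values were chosen so that the hierarchy condition holds), and the helper names are hypothetical.

```python
import itertools
import math

# Assumed reading of the table of correlations in Section 2.
R = {
    ("a", "b"): 0.8, ("a", "c"): 0.5, ("a", "d"): 0.35,
    ("b", "c"): 0.7, ("b", "d"): 0.49, ("c", "d"): 0.306,
}


def r(x, y):
    """Symmetric lookup of a correlation."""
    return R[(x, y)] if (x, y) in R else R[(y, x)]


def tetrad_differences(letters):
    """Tetrad differences; all vanish (to rounding) for a perfect hierarchy."""
    out = []
    for w, x, y, z in itertools.combinations(letters, 4):
        out.append(r(w, x) * r(y, z) - r(w, y) * r(x, z))
        out.append(r(w, y) * r(x, z) - r(w, z) * r(x, y))
    return out


def squared_loading(alpha, k, q):
    """Square of the correlation with g implied by the definition above."""
    return r(alpha, k) * r(alpha, q) / r(k, q)


def net_correlation(i, j, k):
    """Partial (net) correlation of i and j with k held constant."""
    num = r(i, j) - r(i, k) * r(j, k)
    den = math.sqrt((1 - r(i, k) ** 2) * (1 - r(j, k) ** 2))
    return num / den


tetrads = tetrad_differences("abcd")
loading_b = squared_loading("b", "c", "a")
net_ac_b = net_correlation("a", "c", "b")

print("largest tetrad difference:", max(abs(t) for t in tetrads))
print("squared correlation of b with g:", round(loading_b, 3))
print("net correlation of a and c for constant b:", round(net_ac_b, 3))
```

The hierarchy holds to three decimals, yet the implied squared loading exceeds unity, and it does so exactly when the corresponding net correlation is negative.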


3. The General Theory. Consider (for definiteness) the aptitudes $a_{\alpha\beta}$ of
$\beta = 1, 2, \ldots, N$ individuals in $\alpha = 1, 2, \ldots, n$ studies. We may always write

(a) $a_{\alpha\beta} = C_\alpha g_\beta + s_{\alpha\beta}$,
where $C$, $g$ and $s$ are functions, to be determined, of their subscripts. (I now
depart from Spearman’s notation. His 7, 7, § and a are the same, respectively, as 
my C, g, s, and a.) The variable a may be supposed, without loss of generality, to 
have a mean equal to zero, and a $\sigma$ equal to unity, for each fixed value of $\alpha$, as $\beta$
ranges from 1 to $N$.

(b) Let the correlation $r_{ij}$ between $a_i$ and $a_j$, and the net correlation $r_{ij.k}$, be
positive or zero for every set of mutually different values of $i$, $j$ and $k$.

(c) Let the totality of $r_{ij}$’s constitute a hierarchy, i.e. let parallel arrays of the
following determinant be proportional, if the series of 1’s which constitutes the
principal diagonal be omitted from consideration:
$$\Delta \equiv \begin{vmatrix} r_{11} & r_{12} & r_{13} & \cdots & r_{1n} \\ r_{21} & r_{22} & r_{23} & \cdots & r_{2n} \\ \vdots & \vdots & \vdots & & \vdots \\ r_{n1} & r_{n2} & r_{n3} & \cdots & r_{nn} \end{vmatrix}$$
For convenience, we shall also suppose this determinant to have been so arranged
(as is always possible) that
$$r_{12} \ge r_{13} \ge \cdots \ge r_{1n}.$$
The converse of the two-factor theorem now states that it is possible so to
determine $C$, $g$ and $s$ that, for the given $N$ individuals, and for all mutually
different subscripts $k$, $l$,
$$r_{s_k s_l} = 0 \qquad (i),$$
$$r_{s_k g} = 0 \qquad (ii),$$
and this is what we are now required to show.
Proof. There exists by hypothesis an n-way correlation solid, that is, a distribution
in $n$ dimensional space, determined by the $n$ aptitudes $a_\alpha$, $\alpha = 1, 2, \ldots, n$,
of the $N$ individuals, and the total correlations in this solid are as indicated in $\Delta$.
It is necessary to assume that $\Delta$ is not zero, but this is a trivial assumption, for the
vanishing of this determinant would mean that the $n$ total regression “planes” of
this solid would have a “line” in common, instead of the general mean point only,
and in other geometrical ways this distribution would be so peculiar as to be
unrealisable in practice. We shall now show that: there exists also a distribution of
these $N$ individuals in $(n+1)$-way space, determined by the given $N$ values of each
of the $n$ variables $a_1, \ldots, a_n$, and by the $N$ values to be assigned to a single new
variable $g$; thus the mean $g$ is zero, the total correlations are as indicated in the
determinant $R$ of (iv), and also, for $R$, condition (c) holds, even when the leading
element $r_{gg} = 1$ is included, and thus, finally, for all mutually different values of $k$ and $l$,
$$r_{kl} = r_{kg}\, r_{lg} \qquad (iii),$$
$$R \equiv \begin{vmatrix} r_{gg} & r_{g1} & r_{g2} & \cdots & r_{gn} \\ r_{1g} & r_{11} & r_{12} & \cdots & r_{1n} \\ r_{2g} & r_{21} & r_{22} & \cdots & r_{2n} \\ \vdots & \vdots & \vdots & & \vdots \\ r_{ng} & r_{n1} & r_{n2} & \cdots & r_{nn} \end{vmatrix} \qquad (iv).$$

First, such a set of positive numbers does exist, and none are greater than unity;
for, put $r_{gg} = 1$, and let $r_{1g}$, $r_{2g}$ and $r_{3g}$ be determined by the equations
$$r_{12} = r_{1g}\, r_{2g}, \qquad r_{13} = r_{1g}\, r_{3g}, \qquad r_{23} = r_{2g}\, r_{3g} \qquad (v).$$
The solution of (v) is
$$r_{1g} = \sqrt{\frac{r_{12}\, r_{13}}{r_{23}}}, \qquad r_{2g} = \sqrt{\frac{r_{12}\, r_{23}}{r_{13}}}, \qquad r_{3g} = \sqrt{\frac{r_{13}\, r_{23}}{r_{12}}},$$
and hence by (b) the three numbers $r_{g1}$, $r_{g2}$, $r_{g3}$ are positive and not greater than 1.
Then, if $l > 3$, let
$$r_{lg} = \frac{r_{1l}}{r_{1g}} = \sqrt{\frac{r_{1l}\, r_{2l}}{r_{13}\, r_{23}}}\; r_{3g}.$$
Since $r_{1l} \le r_{13}$ and $r_{2l} \le r_{23}$ and $r_{3g} \le 1$, it follows that
$$r_{lg} \le 1 \qquad (vi).$$
It also follows from these equations that
$$r_{kl} = r_{kg}\, r_{lg}, \text{ etc.},$$
for all mutually different values of $k$ and $l$, and so, adopting here the notation of
Piaggio and others, we have
$$r_{kg} = \frac{1}{\mu_k}, \qquad r_{kl} = \frac{1}{\mu_k \mu_l} \qquad (vii).$$
All these interrelationships may be represented compactly by the following matrix,
$M$, in which parallel arrays are proportional, now without the exception of the
principal diagonal:
$$M = \begin{Vmatrix} 1 & r_{g1} & r_{g2} & r_{g3} & \cdots & r_{gn} \\ r_{1g} & r_{1g}^2 & r_{12} & r_{13} & \cdots & r_{1n} \\ r_{2g} & r_{21} & r_{2g}^2 & r_{23} & \cdots & r_{2n} \\ \vdots & \vdots & \vdots & \vdots & & \vdots \\ r_{ng} & r_{n1} & r_{n2} & r_{n3} & \cdots & r_{ng}^2 \end{Vmatrix} \qquad (viii).$$

So far, we have proved that there actually does exist a set of numbers as
indicated in $M$, and that each of them is positive and at most equal to unity, but
we have not shown that they can all be correlation coefficients. That depends on
whether there exists an $(n + 1)$-way frequency distribution whose total correlations
are these numbers, and this is the important thing it now remains for us to
establish. A simple but rather helpful picture of the method is afforded by a case
of three variables. Consider a two-way frequency distribution between $x$ and $y$.
This may be represented by a set of points on a horizontal ($xy$) plane. Of course
$r_{xy}$ is known. Now suppose we have the problem of constructing a three-way
distribution in $xyz$ space, having the same $r_{xy}$ and also some pre-assigned $r_{yz}$ and
$r_{zx}$. It is clear that we may move the points vertically at pleasure without
disturbing $x$ or $y$ or $r_{xy}$. If we do this in such a way as to produce the proper $r_{yz}$
when the points are projected on the $yz$ plane, it may be that we will not thus
have produced the proper $r_{zx}$ when they are projected on the $zx$ plane. The
question whether we can do both at once depends on how many points there are
and how many distinct conditions must be fulfilled; and the most direct way of
solving the problem would seem to be, therefore, to write out the conditions and
count them. This is what we now propose to do, but our points exist initially in
space of $n$ dimensions instead of two. The $(n + 1)$th direction is the direction of $g$.
The number of points is $N$. Each of these is to be moved in the $g$ direction to its
proper position, and the co-ordinate of that new position is also to be called $g$. The
co-ordinates of the $N$ points, as located ultimately, are subject to $n$ conditions only,
viz.
$$\frac{1}{N}\sum_\beta g_\beta\, a_{i\beta} = \sigma_g\, r_{gi}, \qquad (i = 1, 2, \ldots, n) \qquad (ix),$$
where
$$\sigma_g^2 = \frac{1}{N}\sum_\beta g_\beta^2,$$

and from them the function $g$ is to be determined. Since there are $n$ conditions,
the most natural function of the $a$’s to assume as the form of $g$ is the linear combination
$$y = A_1 a_1 + A_2 a_2 + \ldots + A_n a_n \qquad (x),$$
but this is by no means necessary. Any other function involving $n$ arbitrary
constants might do as well, or better, and we shall see later that we may start with
vastly more general functions. It happens too that this linear function is not quite
sufficient by itself, because when inserted in (ix) it would lead to $n$ equations in
the $A$’s which would not have a solution. Due to the presence of $\sigma_g$, these equations
are not linear in the $A$’s. In order to meet this new difficulty, it is necessary to
examine it more closely. Solve Equations (ix) inserting $g = y$,
$$\frac{1}{N}\sum_\beta (A_1 a_1 + \ldots + A_n a_n)\, a_i = \sigma_g\, r_{gi}, \qquad (i = 1, 2, \ldots, n),$$
first as if $\sigma_g$ were a known number, noting that they become
$$\begin{aligned} A_1 r_{11} + A_2 r_{12} + \ldots + A_n r_{1n} &= r_{g1}\,\sigma_g, \\ &\;\;\vdots \\ A_1 r_{n1} + A_2 r_{n2} + \ldots + A_n r_{nn} &= r_{gn}\,\sigma_g \end{aligned} \qquad (xi).$$
Due to the proportional relationships that exist among the $r$’s,
$$\Delta = \frac{(\mu_1^2 - 1) \cdots (\mu_n^2 - 1)}{K^2\, \mu_1^2 \cdots \mu_n^2},$$
and the solution is
$$A_i = K^2 \sigma_g\, \frac{\mu_i}{\mu_i^2 - 1}, \qquad (i = 1, 2, \ldots, n) \qquad (xii),$$
where
$$K^2 = \frac{1}{1 + \dfrac{1}{\mu_1^2 - 1} + \ldots + \dfrac{1}{\mu_n^2 - 1}}.$$

This is not a true solution, because $\sigma_g$ is a function of the $A$’s, but (xii) is a more
illuminating form in which (xi) may be written. Together with (xii) we must now
satisfy also the equation $\sigma_y = \sigma_g$, where, summing for all individuals:
$$\sigma_y^2 = \frac{1}{N}\sum_\beta (A_1 a_1 + \ldots + A_n a_n)^2 \qquad (xiia).$$
It will be shown that this cannot be done, except in the trivial case where all the
$A$’s vanish. By (xiia),
$$\sigma_y^2 = (A_1^2 + \ldots + A_n^2) + 2(A_1 A_2 r_{12} + \ldots + A_{n-1} A_n r_{n-1,n}),$$
and by (xii),
$$A_i A_j r_{ij} = K^4 \sigma_g^2\, \frac{\mu_i\, \mu_j\, r_{ij}}{(\mu_i^2 - 1)(\mu_j^2 - 1)} = \frac{K^4 \sigma_g^2}{(\mu_i^2 - 1)(\mu_j^2 - 1)},$$
so that
$$\sigma_y^2 = K^4 \sigma_g^2 \left\{ \sum_{i} \frac{\mu_i^2}{(\mu_i^2 - 1)^2} + 2\sum_{i<j} \frac{1}{(\mu_i^2 - 1)(\mu_j^2 - 1)} \right\} = \sigma_g^2\,(1 - K^2) \qquad (xiii).$$
Therefore, we cannot satisfy both (xii), which is the same as (xi), and the condition
that $\sigma_y^2 = \sigma_g^2$, i.e. we cannot let $g = y$, unless $K = 0$. The extreme case where $K$ is
zero, or near it, is discussed in Section 5. It is the condition for an “almost”
unique $g$.
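The relations (xi) to (xiii) are easy to confirm numerically. The sketch below is an illustration under assumptions: the four loadings $r_{ig} = 1/\mu_i$ are hypothetical values chosen for the demonstration, $\sigma_g$ is set to 1, and the hierarchy is built directly from $r_{ij} = r_{ig} r_{jg}$ with unit diagonal.

```python
def k_squared(loadings):
    """K^2 = 1 / (1 + sum r^2 / (1 - r^2)), with r = 1/mu."""
    return 1.0 / (1.0 + sum(r * r / (1.0 - r * r) for r in loadings))


def weights(loadings, sigma_g=1.0):
    """Coefficients A_i of equation (xii), written in terms of r_ig."""
    k2 = k_squared(loadings)
    return [k2 * sigma_g * r / (1.0 - r * r) for r in loadings]


def hierarchy(loadings):
    """Correlation matrix with r_ij = r_ig * r_jg off the diagonal."""
    n = len(loadings)
    return [
        [1.0 if i == j else loadings[i] * loadings[j] for j in range(n)]
        for i in range(n)
    ]


loadings = [0.9, 0.8, 0.7, 0.6]   # hypothetical r_1g .. r_4g
R = hierarchy(loadings)
A = weights(loadings)
n = len(loadings)

# Left-hand sides of (xi); each should equal r_gi * sigma_g.
lhs = [sum(R[i][j] * A[j] for j in range(n)) for i in range(n)]

# Variance of y = A_1 a_1 + ... + A_n a_n, equation (xiia).
var_y = sum(A[i] * R[i][j] * A[j] for i in range(n) for j in range(n))

print("K^2 =", round(k_squared(loadings), 4))
print("(xi) satisfied:", all(abs(l - r) < 1e-12 for l, r in zip(lhs, loadings)))
print("var(y) =", round(var_y, 4), " 1 - K^2 =", round(1 - k_squared(loadings), 4))
```

The variance of $y$ always falls short of $\sigma_g^2$ by the factor $1 - K^2$, which is the gap the variable $t$ has to fill.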

Having moved our points up to the position occupied by the plane
$$y = A_1 a_1 + \ldots + A_n a_n \qquad (xiv),$$
and having found that they satisfied all the requirements, except that the standard
deviation was too small, the obvious thing to do next would be to move them away
from this plane so as to secure the correct standard deviation but not disturb what
has already been accomplished. So we make this plane the regression plane of the
points in their final position, denoting by $t$ their distances (parallel to $g$) from it.
This gives us a good deal of latitude in our choice of $t$’s, but we are restricted somewhat.
Let $g = t + y$, where, for each $i$,
$$\sum_\beta t_\beta = 0, \qquad \sum_\beta t_\beta\, a_{i\beta} = 0 \qquad (xv).$$
Then
$$\sigma_g^2 = \sigma_t^2 + \sigma_y^2 \qquad (xvi),$$
and so we must choose $t$ so that both (xv) and (xvi) will be satisfied; then (xii)
will yield the solution. This may not be the only way in which one may obtain a $g$
which will have the necessary requirements, but it is certainly one way. By (xiii),
condition (xvi) is the same as
$$\sigma_g^2 = \sigma_t^2 + \sigma_g^2\,(1 - K^2),$$
and hence
$$\sigma_t = K\sigma_g \qquad (xvii).$$
This choice of $\sigma_t$ is arbitrary in so far as $\sigma_g$ is arbitrary, but in order that $g$ may
be compared with the $a$’s it is desirable to have $\sigma_g = 1$, and so $\sigma_t = K$, and
$$\sigma_y = \sqrt{1 - K^2}.$$
Thus there is a definite restriction on the freedom of $t$, and, as stated earlier, it
has to do with the almost uniqueness of $g$. Postponing that subject, we now complete
the proof. It proceeds from here as indicated by Spearman and others, and
so may be indicated very briefly. It remains only to show that (i) and (ii) are
satisfied. It follows from (iii) that the net correlation, $r_{kl.g} = 0$, if $k$ and $l$ are
different. The total regressions of $a_k$ and $a_l$ on $g$ are: $a_k = r_{kg}\,g$ and $a_l = r_{lg}\,g$, and
it is known from general correlation theory that this net correlation is the simple
correlation between $(a_k - r_{kg}\,g)$ and $(a_l - r_{lg}\,g)$. Set these equal to $s_k$ and $s_l$
respectively, and let $C_k = r_{kg}$, and it follows that we have so chosen the variables of
(a) § 3 that (i) is satisfied:
$$r_{s_k s_l} = 0.$$
To obtain (ii), compute the correlation between $s_k$ and $g$. Since
$$\sum_\beta (a_k - g\,r_{kg}) = 0 \quad \text{and} \quad \sigma_{s_k}^2 = \frac{1}{N}\sum_\beta (a_k - g\,r_{kg})^2 = 1 - r_{kg}^2 \qquad (xviii),$$
$$r_{s_k g} = \frac{1}{N\sqrt{1 - r_{kg}^2}} \sum_\beta (a_k - g\,r_{kg})\, g = \frac{r_{gk} - r_{gk}}{\sqrt{1 - r_{kg}^2}} = 0,$$
as desired.

4. Necessity. We shall now show the necessity of the condition in (b), $r_{ij.k} \ge 0$,
namely, that it follows from the two-factor theorem and from this theorem only that
$r_{ij} \ge r_{ik}\, r_{jk}$. It is quite well known that it follows from the two-factor theorem in the
direct (not the converse) form that
$$r_{ik}\, r_{jk} = r_{ij}\, r_{kg}^2,$$
where $r_{kg}$ is a correlation coefficient which is positive and at most equal to unity;
so $r_{ij} \ge r_{ik}\, r_{jk}$.


5. Interpretation. As stated in Section 3, the values of $g$ may be assigned to
the $N$ individuals in many ways, at least provided $N$ is greater than $n + 1$. This
means that we may choose, for example, for the general factor applicable to the
first individual a value twice as great or half as great as the one assigned to the
second individual, at pleasure, and still meet all the conditions of this theorem.
This sort of a general factor is not, I should think, what psychologists desire in
order to establish the two-factor theory of mental abilities. If the number of ways in
which one may assign values of the general factor to the $N$ individuals is very large
instead of unique, it would seem to me doubtful whether from a psychological
point of view it would be meaningful to assert the existence of such a “factor” at
all, but that is obviously a question for psychologists. That it is true, however,
that the general factor is strikingly indeterminate is well established in the
numerical example of Section 6. Consider the 11th and the 12th individuals
($\beta = 11, 12$) of that example. For the 11th, the general factor equals 0.85 for one
choice of $t$’s, $-1.77$ for another; for the 12th, these figures are reversed. The scale
($\sigma$) of these measurements is unity and the origins are at their means. The
disparity is perhaps the more striking because both these individuals have been
assigned the same scores on all four of the tests. It would not be relevant to assert
that I had made a peculiar choice of $t$’s in order to secure these differences, that a
random choice would probably not have produced them, for the choice of $t$’s is not
restricted to be random; it is arbitrary. The so-called arbitrary constant of
integration affords a perfect analogy. Given an hierarchy, one can assert the existence
of a general factor in exactly the same sense as, when given an analytic function,
one can assert the existence of its anti-derivative or “indefinite” integral; there
will exist a family of general factor functions, every member of which would lead
to the given hierarchy.

Admitting this, Piaggio and Irwin have asserted that nevertheless the general
factor $g$ is what one might call “almost” unique, in the sense that the possible
fluctuations of $g$ from $y$ are small on the average, at least in certain circumstances.
I shall discuss this fluctuation in a moment, but let us consider first a possible
demurrer. If their contention is justified, it means only that, for any choice of $t$’s,
the average deviation of $g$ from $y$ is small; each member of the infinite family of
surfaces of which $y$ is the common regression plane lies on the average close to that
plane. Not every point of it needs to be close to that plane; whatever individual
be selected, it is possible to find at least one member of that family for which his $g$
will differ greatly (if $N$ be much greater than $n$) from $y$. For any given hierarchy,
then, we may write down a whole family of two-factor patterns, some of which will
differ widely from each other in the case of any previously selected individual.
However, if the contention of these authors be justified, it does follow that for each
member of that family the majority of the individuals must have $g$’s close to $y$. We
do know that our hierarchy must have associated with it a set of $g$’s which determine
one or another member of this family, and this set as a whole lies close to $y$.
Therefore, the group behaviour of the $N$ individuals might be expected to be as if
$g$ were unique. So an investigation of the fluctuation of $g$ is warranted.

We have seen that, if $\sigma_g = \sigma_a = 1$, then $\sigma_t = K$, $\sigma_y = \sqrt{1 - K^2}$, $g = t + y$. It is a
question whether the approximation, $g = y$, is a close one, and this depends on
whether ($\sigma_t = K$) is small, relative to ($\sigma_g = 1$), say as small as 0.1. In terms of
the $r$’s,
$$\frac{1}{K^2} = 1 + \frac{r_{1g}^2}{1 - r_{1g}^2} + \ldots + \frac{r_{ng}^2}{1 - r_{ng}^2}.$$
There are two ways in which $1/K$ can be as large as 10. One is when the early $r$’s
are very close to 1. From the equation,
$$a_{1\beta} = r_{1g}\, g_\beta + s_{1\beta},$$
we get
$$\sum_\beta a_1^2 = r_{1g} \sum_\beta g\, a_1 + \sum_\beta s_1 a_1.$$
Hence, using (xviii),
$$1 = r_{1g}^2 + r_{s_1 a_1}\, \sigma_{s_1}, \qquad r_{s_1 a_1} = \sqrt{1 - r_{1g}^2}.$$
So, if $r_{1g}$ is close to 1, $r_{s_1 a_1}$ is close to zero. Moreover, for values of $k$ different from
1 it follows from the earlier results (§ 3, i, and ii), that $r_{s_k a_1} = 0$ exactly. Therefore,
$a_1$ has by hypothesis all the properties of the general factor sought, approximately.
In other words, this is a trivial case where the $g$ of the two-factor theorem has
already been found, approximately, among the aptitudes measured.
The other case is where $n$ is so large that, even if each term of $1/K^2$ is small,
yet their sum is large. For example, if $r_{kl} = 0.5$, for all $k \ne l$, $r_{kg}^2 = 0.5$, and
$$\frac{1}{K^2} = 1 + 1 + 1 + \ldots + 1,$$
so that $1/K = 10$ if $n = 99$. To accomplish the same result when $r_{kl} = 0.25$ would
require the measurement of 297 aptitudes. Practically, it would appear difficult to
secure cases of hierarchies among aptitudes as numerous as this with intercorrelations
as large, but if they should occur, it would then be true that the group general
factor would be “almost” unique. It is to be remembered that for the success of
our theory the number of individuals $N$ must exceed the number of tests $n$*.
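The two figures quoted above (99 and 297 aptitudes) follow directly from the expression for $1/K^2$ when every intercorrelation is equal. A minimal sketch, in which the function name and the argument `ratio` (the target value of $1/K$) are hypothetical:

```python
def aptitudes_needed(r_kl, ratio=10.0):
    """Number of tests n giving 1/K = ratio when every r_kl is the same.

    With equal intercorrelations r_kg^2 = r_kl, so each test adds
    r_kl / (1 - r_kl) to 1/K^2 = 1 + n * r_kl / (1 - r_kl).
    """
    per_test = r_kl / (1.0 - r_kl)
    return (ratio ** 2 - 1.0) / per_test


for r_kl in (0.5, 0.25):
    print(f"r_kl = {r_kl}: n = {round(aptitudes_needed(r_kl))}")
```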
* The relationship between Piaggio’s notation and mine is as follows. His a=m,g+q8q, 
6,=1, gg=1,¢0,=1. My 4, =C.9+8,, where ¢,=1, ¢,=1, 7. =N 1-1". His g=k*t+ ki, where 


V1-k 

“= 
My g=y +t, where ¢,=V1-K?, ¢,=K. His u.=my w. His k=my K. Thus his indeterminate part of 
g is ki and mine is t. Now he says that by increasing his N (my n), the coefficient of the uncertainty 


where 


a 


» GF 


term, $k$, “can be made as small as we please,” and concludes from that that $g$ may become almost
unique. This does not seem conclusive, for if his $k$ is less than 1, his $k^2$ is still smaller and that is the
coefficient of the determined term (his $t$, my $y$). Irwin rightly judges that it is a matter of standard
deviations as well as of coefficients, but he says that $k^{-2}$ can be made small at pleasure. This looks like
a misprint for $k^2$, for, as noted before, it is the smallness of $k^2$ which is needed to render $g$ almost
unique.

According to Wilson, any set of $n$ aptitudes which do not lead to a hierarchy may be replaced by $n$
artificial aptitudes, which are certain combinations of the given aptitudes, which do lead to a hierarchy.
Since $n$ may be as large as desired, it would follow that there always exists a set of $n$ artificial aptitudes
for which there is a group general factor which is almost unique.
This argument for uniqueness has proceeded from the assumption that the form
of the function $g$ was $g = y + t$, where $y$ was linear and $t$ had special characteristics,
but it was remarked that it was not necessary, initially, that $y$ should be linear.
Indeed, returning to (x), we might have written $g = \phi + t$, where
$$\phi = A_1 f_1 + \ldots + A_n f_n,$$
and $f_i$ was any function of $a_i$ such that
$$\sum_\beta f_i = 0 \quad \text{and} \quad \sum_\beta f_i^2 = N, \qquad (i = 1, \ldots, n).$$
In that case, values of $A_i$ could have been found, usually, subject to the same
conditions (ix) as before, and from that point on the original proof would have held
good. But, nevertheless, in that case, the set of values found for $g$ would still have
constituted some sort of a frequency distribution in $(n + 1)$ space and this distribution
would have had a regression plane. Also, it is known from general correlation
theory that the differences between the ordinates to such a plane and the values of
$g$ thus found would have had the properties previously ascribed to $t$:
$$\sum_\beta t = 0, \qquad \sum_\beta a\,t = 0.$$
Hence this new $g$ could be represented as the sum of a linear function and such a
$t$; and so it has now been shown that our original solution was in fact the most
general one possible.


6. Example. For the individuals $\beta = 1$ to 40, let the $a$’s be the observations,
having the values indicated in the table. It follows that the determinant $R$ (see iv)
is as below, indicating a hierarchy of $r_{ij}$’s and the $r_{ig}$’s associated with it. The $y$’s
are determined from the equations
$$y = \frac{6}{7\sqrt{5}}\,(a_1 + a_3) + \frac{2}{5\sqrt{7}}\,(a_2 + a_4),$$
$$A_1 = \frac{6}{7\sqrt{5}}, \quad A_2 = \frac{2}{5\sqrt{7}}, \quad A_3 = A_1, \quad A_4 = A_2, \quad K^2 = \frac{3}{35}, \quad N = 40, \quad n = 4.$$

   β     |  a_1   |  a_2   |  a_3   |  a_4   |       y
  1-5    |   2c   |   2c   |    c   |    c   |  12c  |  1.37
  6-10   |    0   |   2c   |    c   |    c   |   6c  |   .69
 11-15   |   -c   |   -c   |    0   |    0   |  -4c  |  -.46
 16-20   |   -c   |   -c   |  -2c   |    0   | -10c  | -1.14
 21-25   |   -c   |   -c   |   -c   |  -3c   | -10c  | -1.14
 26-30   |    c   |   -c   |    c   |    c   |   6c  |   .69
 31-35   |    c   |    c   |    c   |    c   |   8c  |   .91
 36-40   |   -c   |   -c   |   -c   |   -c   |  -8c  |  -.91
   Σ     |    0   |    0   |    0   |    0   |    0  |   .01
   σ²    |    1   |    1   |    1   |    1   |       |   .91
   c     |  2/√5  |  2/√7  |  2/√5  |  2/√7  | 4/35  |











4 
; 
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‘ 
| 2 2 2 o. 
| se 
Jd NT V5 V7 
2 1 4 + 
V5 /35 5 35 
2 4 1 4 + 
R=! 77 Jes ve” FT 
2 4 4 4 
a a. oer 
2 4 + 4 1 
| /7 /35 7 /35 
' 
The $t$’s must now be chosen so as to satisfy the relations (xv) and the condition
that $\sigma_t^2 = 3/35$. Two simple choices are:

Case I: $t = \sqrt{12/7} = 1.31$ for $\beta = 11$, and $t = -1.31$ for $\beta = 12$, $t = 0$ for all
other $\beta$’s.

Case II: $t = -1.31$ for $\beta = 11$, and $t = 1.31$ for $\beta = 12$, $t = 0$ otherwise. Thence
$g = y + t$, and then the $s$’s are determined so as to satisfy the following equations
for each $\beta$:
$$a_1 = 2g/\sqrt{5} + s_1, \qquad a_2 = 2g/\sqrt{7} + s_2,$$
$$a_3 = 2g/\sqrt{5} + s_3, \qquad a_4 = 2g/\sqrt{7} + s_4.$$
Some of the values of $g$ and $s_1$ are approximately as in the next table.

       |      Case I      |     Case II
   β   |    g    |  s_1   |    g    |  s_1
  10   |   .69   |  -.62  |   .69   |  -.62
  11   |   .85   | -1.65  |  -1.77  |   .69
  12   |  -1.77  |   .69  |   .85   | -1.65
  13   |  -.46   |  -.48  |  -.46   |  -.48
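The whole example can be rebuilt and checked in a few lines. The sketch below is an illustration under assumptions: the multiples of $c$ in `GROUP_MULTIPLES` are a reconstruction of the partly illegible table of $a$'s (chosen so that the means, variances and the correlations in $R$ come out as stated), and all names are hypothetical.

```python
import math

# Multiples of c taken by a_1..a_4 in the eight groups of five individuals.
GROUP_MULTIPLES = [
    (2, 2, 1, 1),       # beta = 1-5
    (0, 2, 1, 1),       # beta = 6-10
    (-1, -1, 0, 0),     # beta = 11-15
    (-1, -1, -2, 0),    # beta = 16-20
    (-1, -1, -1, -3),   # beta = 21-25
    (1, -1, 1, 1),      # beta = 26-30
    (1, 1, 1, 1),       # beta = 31-35
    (-1, -1, -1, -1),   # beta = 36-40
]
UNITS = (2 / math.sqrt(5), 2 / math.sqrt(7), 2 / math.sqrt(5), 2 / math.sqrt(7))
LOADINGS = UNITS  # r_1g .. r_4g; in this example they coincide with the units c
WEIGHTS = (6 / (7 * math.sqrt(5)), 2 / (5 * math.sqrt(7)),
           6 / (7 * math.sqrt(5)), 2 / (5 * math.sqrt(7)))

# One row of four scores per individual, beta = 1..40.
scores = [[m * c for m, c in zip(row, UNITS)]
          for row in GROUP_MULTIPLES for _ in range(5)]
N = len(scores)
columns = [[row[i] for row in scores] for i in range(4)]


def mean_product(u, v):
    """(1/N) * sum of products; a correlation when both have mean 0, s.d. 1."""
    return sum(x * y for x, y in zip(u, v)) / len(u)


# Regression-plane values y = A_1 a_1 + ... + A_4 a_4.
y = [sum(w * a for w, a in zip(WEIGHTS, row)) for row in scores]


def general_factor(sign):
    """g = y + t, with t = +/- sqrt(12/7) on individuals 11 and 12 only."""
    t = [0.0] * N
    t[10] = sign * math.sqrt(12 / 7)     # beta = 11
    t[11] = -sign * math.sqrt(12 / 7)    # beta = 12
    return [yi + ti for yi, ti in zip(y, t)]


def specific(i, g):
    """Specific factor s_i = a_i - r_ig * g."""
    return [a - LOADINGS[i] * gb for a, gb in zip(columns[i], g)]


g_case1 = general_factor(+1)
g_case2 = general_factor(-1)

for label, g in (("Case I", g_case1), ("Case II", g_case2)):
    s1 = specific(0, g)
    print(label, [(beta, round(g[beta - 1], 2), round(s1[beta - 1], 2))
                  for beta in (10, 11, 12, 13)])
```

Both choices of $t$ reproduce the same correlations with the four tests, although individuals 11 and 12 exchange their values of $g$.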














THE DISTRIBUTION OF THE INDEX IN A NORMAL 


BIVARIATE POPULATION. 
By E. C. FIELLER, B.A. 


The Probability Integral of the Index Distribution. 


Consider the distribution of the ratio
$$v = \frac{y}{x}$$
in any bivariate population
$$z = f(x, y) \qquad (1),$$
where
$$\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} f(x, y)\, dx\, dy = 1.$$
Points $(x, y)$ corresponding to a given value of $v$ lie on the line
$$y = vx \qquad (2);$$
hence the chance that a random member of the population (1) will have an index
lying in the range $v_1 < v < v_2$ is equal to the volume of the portion of the frequency
surface (1) that lies above the area swept out in the $xy$-plane by the line (2) as it
revolves in the positive direction from the position
$$y = v_1 x \qquad (3)$$
to the position
$$y = v_2 x \qquad (4).$$
If we take $v_1 = -\infty$, this volume is the chance that the index will not exceed
$v_2$; the line (3) is then the $y$-axis, and the volume is
$$V_2 = \left( \int_{-\infty}^{0}\int_{v_2 x}^{\infty} + \int_{0}^{\infty}\int_{-\infty}^{v_2 x} \right) f(x, y)\, dy\, dx \qquad (5).$$


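When the joint distribution of x and y is the normal one,
\[
z = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-r^2}}
\exp\left[-\frac{1}{2(1-r^2)}\left\{\left(\frac{x-\bar x}{\sigma_x}\right)^2 - 2r\,\frac{(x-\bar x)(y-\bar y)}{\sigma_x\sigma_y} + \left(\frac{y-\bar y}{\sigma_y}\right)^2\right\}\right] \tag{6},
\]
(5) gives, for the chance of an index not greater than v, when x and y are measured from their means,
\[
V = \iint_{a+b} \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-r^2}}
\exp\left[-\frac{1}{2(1-r^2)}\left(\frac{x^2}{\sigma_x^2} - \frac{2rxy}{\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2}\right)\right] dx\,dy \tag{7},
\]
where a and b are the two portions of the xy-plane indicated in Fig. 1.

The boundaries of a and b are the lines
\[ x + \bar x = 0, \qquad y + \bar y = v(x + \bar x), \]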
and if we put $x = \xi$, $y - vx = \eta$, the portions α and β of the ξη-plane that correspond
to a and b are bounded (see Fig. 2) by the lines
\[ \xi + \bar x = 0, \qquad \eta + \bar y - v\bar x = 0. \]


[Fig. 1 and Fig. 2: the portions a, b of the xy-plane and the corresponding portions α, β of the ξη-plane.]


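Thus we have, performing the change of variables,
\[
V = \iint_{\alpha+\beta} \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-r^2}}\, e^{-\frac12\chi^2}\, d\xi\, d\eta \tag{8},
\]
where
\[
\chi^2 = \frac{1}{1-r^2}\left\{\frac{\xi^2}{\sigma_x^2} - \frac{2r\,\xi(\eta + v\xi)}{\sigma_x\sigma_y} + \frac{(\eta + v\xi)^2}{\sigma_y^2}\right\}
\equiv \frac{1}{1-\rho^2}\left\{\frac{\xi^2}{\sigma_\xi^2} - \frac{2\rho\,\xi\eta}{\sigma_\xi\sigma_\eta} + \frac{\eta^2}{\sigma_\eta^2}\right\} \tag{9}.
\]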
From the last identity we have
\[
\sigma_\xi^2(1-\rho^2) = \sigma_x^2\sigma_y^2(1-r^2)/(\sigma_y^2 - 2rv\sigma_x\sigma_y + v^2\sigma_x^2),
\]
\[
\sigma_\eta^2(1-\rho^2) = \sigma_y^2(1-r^2),
\]
\[
\frac{\rho}{\sigma_\xi\sigma_\eta(1-\rho^2)} = (r\sigma_y - v\sigma_x)/\sigma_x\sigma_y^2(1-r^2);
\]
squaring the last of these equations, and multiplying by the other two, we get
\[ \rho^2 = (r\sigma_y - v\sigma_x)^2/(\sigma_y^2 - 2rv\sigma_x\sigma_y + v^2\sigma_x^2) \tag{10}, \]
whence
\[ 1 - \rho^2 = (1-r^2)\,\sigma_y^2/(\sigma_y^2 - 2rv\sigma_x\sigma_y + v^2\sigma_x^2) \tag{11}, \]
\[ \sigma_\xi = \sigma_x \tag{12}, \]
\[ \sigma_\eta = (\sigma_y^2 - 2rv\sigma_x\sigma_y + v^2\sigma_x^2)^{\frac12} \tag{13}, \]
and
\[ \sigma_\xi\sigma_\eta\sqrt{1-\rho^2} = \sigma_x\sigma_y\sqrt{1-r^2} \tag{14}. \]
Write
\[
X = \frac{\xi}{\sigma_\xi} = \frac{x}{\sigma_x}, \qquad
Y = \frac{\eta}{\sigma_\eta} = \frac{y - vx}{(\sigma_y^2 - 2rv\sigma_x\sigma_y + v^2\sigma_x^2)^{\frac12}};
\]
the quadrants A and B of the XY-plane that correspond to the portions a and b of
the xy-plane have as common corner the point (−h, −k), where
\[ h = \frac{\bar x}{\sigma_x} \tag{15}, \]
\[ k = \frac{\bar y - v\bar x}{(\sigma_y^2 - 2rv\sigma_x\sigma_y + v^2\sigma_x^2)^{\frac12}} \tag{16}. \]
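The algebra of (10) to (16) is easy to check numerically. The Python sketch below is an illustration only: the function name and the numerical constants are assumptions made for the example, not values from the paper. It evaluates h, k, the transformed correlation ρ (with the sign of $r\sigma_y - v\sigma_x$) and the quantity $\sigma_\eta^2$ for one arbitrary set of constants, and confirms identity (11).

```python
import math

def index_transform(x_bar, y_bar, sx, sy, r, v):
    """Return (h, k, rho, var_eta) for an index value v.

    var_eta is sigma_y^2 - 2 r v sigma_x sigma_y + v^2 sigma_x^2, as in (13).
    """
    var_eta = sy**2 - 2.0 * r * v * sx * sy + v**2 * sx**2
    h = x_bar / sx                                  # equation (15)
    k = (y_bar - v * x_bar) / math.sqrt(var_eta)    # equation (16)
    rho = (r * sy - v * sx) / math.sqrt(var_eta)    # square agrees with (10)
    return h, k, rho, var_eta

# Arbitrary illustrative constants (assumed for this example).
h, k, rho, var_eta = index_transform(10.0, 5.0, 1.0, 2.0, 0.3, 0.4)
identity_11_gap = (1.0 - rho**2) - (1.0 - 0.3**2) * 2.0**2 / var_eta
print(h, k, rho, identity_11_gap)
```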
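From (8),
\[
V = \iint_{A+B} \frac{1}{2\pi\sqrt{1-\rho^2}}\, e^{-\frac{1}{2(1-\rho^2)}(X^2 - 2\rho XY + Y^2)}\, dX\, dY,
\]
so that the chance of obtaining an index not less than v is
\[
C = 1 - V = \left\{\int_{h}^{\infty}\!\int_{k}^{\infty} + \int_{-h}^{\infty}\!\int_{-k}^{\infty}\right\}
\frac{1}{2\pi\sqrt{1-\rho^2}}\, e^{-\frac{1}{2(1-\rho^2)}(X^2 - 2\rho XY + Y^2)}\, dX\, dY \tag{17}.
\]
Here h and k are given by (15) and (16); equation (10) provides two values of
ρ; we decide which is appropriate by noting that as $v \to \infty$, the point
\[ (h, k) \to \left(\frac{\bar x}{\sigma_x},\ -\frac{\bar x}{\sigma_x}\right), \]
so that to make $1 - V \to 0$ we must take
\[ \rho = \frac{r\sigma_y - v\sigma_x}{(\sigma_y^2 - 2rv\sigma_x\sigma_y + v^2\sigma_x^2)^{\frac12}} \tag{18}. \]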
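Tables are already in existence by means of which the numerical value of C
may be found*. These tables show over a range of values of ρ extending from −1
to +1, the value of
\[
\int_h^\infty\!\int_k^\infty Z(\rho)\,dX\,dY = \int_h^\infty\!\int_k^\infty \frac{1}{2\pi\sqrt{1-\rho^2}}\, e^{-\frac{1}{2(1-\rho^2)}(X^2 - 2\rho XY + Y^2)}\,dX\,dY
\]
for positive values of h and k; for evaluating the integral when h and k are not
both positive, we use the relations
\[
\int_{-h}^\infty\!\int_k^\infty Z(\rho)\,dX\,dY = \int_k^\infty \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 x^2}dx - \int_h^\infty\!\int_k^\infty Z(-\rho)\,dX\,dY,
\]
\[
\int_{h}^\infty\!\int_{-k}^\infty Z(\rho)\,dX\,dY = \int_h^\infty \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 x^2}dx - \int_h^\infty\!\int_k^\infty Z(-\rho)\,dX\,dY,
\]
\[
\int_{-h}^\infty\!\int_{-k}^\infty Z(\rho)\,dX\,dY = 1 - \int_h^\infty \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 x^2}dx - \int_k^\infty \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 x^2}dx + \int_h^\infty\!\int_k^\infty Z(\rho)\,dX\,dY.
\]

* Tables for Statisticians and Biometricians, Part II, Tables VIII and IX.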


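Thus if the h and k provided by (15) and (16) be both positive or both negative,
we have
\[
C = 1 - \left\{\int_{|h|}^\infty + \int_{|k|}^\infty\right\} \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 x^2}dx + 2\int_{|h|}^\infty\!\int_{|k|}^\infty Z(\rho)\,dX\,dY \tag{17a};
\]
while if h and k be of opposite sign,
\[
C = \left\{\int_{|h|}^\infty + \int_{|k|}^\infty\right\} \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 x^2}dx - 2\int_{|h|}^\infty\!\int_{|k|}^\infty Z(-\rho)\,dX\,dY \tag{17b}.
\]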


Frequency Distribution of the Index. 
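By differentiating V with respect to v, we have the frequency distribution of v.
Since h does not vary with v,
\[
\frac{dV}{dv} = \frac{\partial V}{\partial \rho}\frac{d\rho}{dv} + \frac{\partial V}{\partial k}\frac{dk}{dv} \tag{19}.
\]
By a well-known property of the normal surface,
\[
\frac{\partial}{\partial\rho}\left\{\frac{1}{2\pi\sqrt{1-\rho^2}}\, e^{-\frac{1}{2(1-\rho^2)}(X^2 - 2\rho XY + Y^2)}\right\}
= \frac{\partial^2}{\partial X\,\partial Y}\left\{\frac{1}{2\pi\sqrt{1-\rho^2}}\, e^{-\frac{1}{2(1-\rho^2)}(X^2 - 2\rho XY + Y^2)}\right\},
\]
so that (17) gives
\[
\frac{\partial V}{\partial\rho} = -\frac{1}{\pi\sqrt{1-\rho^2}}\, e^{-\frac{1}{2(1-\rho^2)}(h^2 - 2\rho hk + k^2)}
= -\frac{1}{\pi\sqrt{1-\rho^2}}\, \exp\left[-\frac{1}{2(1-r^2)}\left(\frac{\bar x^2}{\sigma_x^2} - \frac{2r\bar x\bar y}{\sigma_x\sigma_y} + \frac{\bar y^2}{\sigma_y^2}\right)\right] \tag{20}
\]
by (15), (16), and (9).

From (10),
\[
\rho\,\frac{\partial\rho}{\partial v} = -\frac{\sigma_x\sigma_y^2(1-r^2)(r\sigma_y - v\sigma_x)}{(\sigma_y^2 - 2rv\sigma_x\sigma_y + v^2\sigma_x^2)^2},
\]
which, with (11), (18), and (20), gives
\[
\frac{\partial V}{\partial\rho}\frac{d\rho}{dv} = \frac{\sigma_x\sigma_y\sqrt{1-r^2}}{\pi(\sigma_y^2 - 2rv\sigma_x\sigma_y + v^2\sigma_x^2)}\,
\exp\left[-\frac{1}{2(1-r^2)}\left(\frac{\bar x^2}{\sigma_x^2} - \frac{2r\bar x\bar y}{\sigma_x\sigma_y} + \frac{\bar y^2}{\sigma_y^2}\right)\right] \tag{21}.
\]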
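From (16) we find
\[
(\sigma_y^2 - 2rv\sigma_x\sigma_y + v^2\sigma_x^2)^{\frac32}\,\frac{\partial k}{\partial v}
= \sigma_y(r\bar y\sigma_x - \bar x\sigma_y) + v\sigma_x(r\bar x\sigma_y - \bar y\sigma_x) \tag{22},
\]
and from (17)
\[
\frac{\partial V}{\partial k} = \frac{1}{2\pi\sqrt{1-\rho^2}}\left\{\int_h^\infty e^{-\frac{1}{2(1-\rho^2)}(x^2 - 2\rho xk + k^2)}dx - \int_{-h}^\infty e^{-\frac{1}{2(1-\rho^2)}(x^2 + 2\rho xk + k^2)}dx\right\}
\]
\[
= \frac{1}{2\pi}\, e^{-\frac12 k^2}\left\{\int_{\frac{h-\rho k}{\sqrt{1-\rho^2}}}^\infty e^{-\frac12 u^2}du - \int_{-\frac{h-\rho k}{\sqrt{1-\rho^2}}}^\infty e^{-\frac12 u^2}du\right\}
\]
\[
= -\frac{1}{\pi}\, e^{-\frac12 k^2}\int_0^{\frac{h-\rho k}{\sqrt{1-\rho^2}}} e^{-\frac12 u^2}du \tag{23}.
\]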


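Inserting the values of h, k, and ρ in (23), we have, using (20), (21), and (22),
the frequency distribution of v:
\[
\psi(v) = \frac{\sigma_x\sigma_y\sqrt{1-r^2}}{\pi(\sigma_y^2 - 2rv\sigma_x\sigma_y + v^2\sigma_x^2)}\,
\exp\left[-\frac{1}{2(1-r^2)}\left(\frac{\bar x^2}{\sigma_x^2} - \frac{2r\bar x\bar y}{\sigma_x\sigma_y} + \frac{\bar y^2}{\sigma_y^2}\right)\right]
\]
\[
\quad + \exp\left[-\frac12\,\frac{(\bar y - v\bar x)^2}{\sigma_y^2 - 2rv\sigma_x\sigma_y + v^2\sigma_x^2}\right]
\frac{\sigma_y(r\bar y\sigma_x - \bar x\sigma_y) + v\sigma_x(r\bar x\sigma_y - \bar y\sigma_x)}{\pi(\sigma_y^2 - 2rv\sigma_x\sigma_y + v^2\sigma_x^2)^{\frac32}}
\int_0^{\frac{\sigma_y(r\bar y\sigma_x - \bar x\sigma_y) + v\sigma_x(r\bar x\sigma_y - \bar y\sigma_x)}{\sigma_x\sigma_y\{(1-r^2)(\sigma_y^2 - 2rv\sigma_x\sigma_y + v^2\sigma_x^2)\}^{1/2}}} e^{-\frac12 u^2}du \tag{24}.
\]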
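We can obtain this distribution in a somewhat more direct manner. If we write
\[
v = \frac{y}{x}, \qquad \left|\frac{\partial(x, v)}{\partial(x, y)}\right| = \frac{1}{|x|} \tag{25},
\]
we have, from (6), the equation to the joint distribution of x and v:
\[
p(x, v) = \frac{|x|}{2\pi\sigma_x\sigma_y\sqrt{1-r^2}}
\exp\left[-\frac{1}{2(1-r^2)}\left\{\left(\frac{x-\bar x}{\sigma_x}\right)^2 - 2r\,\frac{(x-\bar x)(vx-\bar y)}{\sigma_x\sigma_y} + \left(\frac{vx-\bar y}{\sigma_y}\right)^2\right\}\right] \tag{26}.
\]
On integrating this equation with respect to x, we arrive at (24). As we should
expect, (24) is not altered if we increase $\bar x$, $\bar y$, $\sigma_x$, and $\sigma_y$ in the same ratio*.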


Distribution of the Index in a Curtailed Normal Population. 


The two terms of which the second member of (24) is the sum are essentially
positive; accordingly, the moments of the index-distribution ψ(v) will be infinite
with the contributions of the first of these terms. It is obvious that these infinities
would not arise if we restricted (x, y) to some limited region in the positive quadrant,
since then the index-distribution would have a limited range.


[* An erroneous solution of this problem having been sent to me by Mr G. A. Baker, the above
solution (24) by Mr Fieller was forwarded to him with permission to use it. The result in Equation (24)
was published by Mr Baker in The Annals of Mathematical Statistics, Vol. III. p. 5, February, 1932, the
mention of Mr Fieller being unfortunately overlooked. Ed.]





















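Suppose, for example, that we take the joint distribution of x and y to be
\[
z = z_0 \exp\left[-\frac{1}{2(1-r^2)}\left\{\left(\frac{x-\bar x}{\sigma_x}\right)^2 - 2r\,\frac{(x-\bar x)(y-\bar y)}{\sigma_x\sigma_y} + \left(\frac{y-\bar y}{\sigma_y}\right)^2\right\}\right] \tag{27},
\]
provided that (x, y) lies inside the ellipse
\[
\left(\frac{x-\bar x}{\sigma_x}\right)^2 - 2r\,\frac{(x-\bar x)(y-\bar y)}{\sigma_x\sigma_y} + \left(\frac{y-\bar y}{\sigma_y}\right)^2 = \lambda^2 \tag{28},
\]
which is a probability contour of the normal surface (6), and zero if (x, y) lie outside
this ellipse.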

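Then by applying the transformation (25) and integrating out with respect to x,
we have, for the ordinate of the curve of distribution of v,
\[
I(\lambda) = z_0 \int_{x_1(v)}^{x_2(v)} |x| \exp\left[-\frac{1}{2(1-r^2)}\left\{\left(\frac{x-\bar x}{\sigma_x}\right)^2 - 2r\,\frac{(x-\bar x)(vx-\bar y)}{\sigma_x\sigma_y} + \left(\frac{vx-\bar y}{\sigma_y}\right)^2\right\}\right] dx \tag{29},
\]
where $x_1(v)$ and $x_2(v)$ are the smaller and greater of the roots of
\[
\left(\frac{x-\bar x}{\sigma_x}\right)^2 - 2r\,\frac{(x-\bar x)(vx-\bar y)}{\sigma_x\sigma_y} + \left(\frac{vx-\bar y}{\sigma_y}\right)^2 = \lambda^2 \tag{30}.
\]
Writing
\[
\alpha = \frac{\sigma_y^2 - 2rv\sigma_x\sigma_y + v^2\sigma_x^2}{\sigma_x^2\sigma_y^2}, \qquad
\beta = \frac{\sigma_y(\bar x\sigma_y - r\bar y\sigma_x) + v\sigma_x(\bar y\sigma_x - r\bar x\sigma_y)}{\sigma_y^2 - 2rv\sigma_x\sigma_y + v^2\sigma_x^2},
\]
\[
\gamma = \frac{\bar x^2}{\sigma_x^2} - \frac{2r\bar x\bar y}{\sigma_x\sigma_y} + \frac{\bar y^2}{\sigma_y^2}, \qquad
\gamma - \alpha\beta^2 = \frac{(1-r^2)(v\bar x - \bar y)^2}{\sigma_y^2 - 2rv\sigma_x\sigma_y + v^2\sigma_x^2} \tag{31},
\]
we have, for (29) and (30),
\[
I(\lambda) = z_0 \int_{x_1}^{x_2} |x|\, \exp\left[-\frac{\alpha(x^2 - 2\beta x) + \gamma}{2(1-r^2)}\right] dx \tag{32},
\]
where $x_1$ and $x_2$ are the roots of
\[ \alpha(x^2 - 2\beta x) + \gamma = \lambda^2 \tag{33}. \]

Now if the ellipse (28) lies entirely in the positive quadrant, the index must lie
between two positive values given by the gradients of the tangents to the ellipse
from the origin. For these values $x_1$ and $x_2$ coincide, so that they are given by
\[ \gamma - \alpha\beta^2 = \lambda^2. \]
For all values of v lying between these limits, $x_1$ and $x_2$ are different and positive
so that β > 0, and (32) gives
\[
I(\lambda) = z_0\, e^{-\frac{\gamma - \alpha\beta^2}{2(1-r^2)}} \left\{\int_{x_1}^{x_2} (x - \beta)\, e^{-\frac{\alpha}{2(1-r^2)}(x-\beta)^2} dx + \beta \int_{x_1}^{x_2} e^{-\frac{\alpha}{2(1-r^2)}(x-\beta)^2} dx\right\}.
\]
The first of these integrals vanishes, since $x_1$ and $x_2$ are the roots of (33); putting,
in the second,
\[ t = (x - \beta)\sqrt{\frac{\alpha}{1-r^2}}, \]
we find
\[
I(\lambda) = z_0\,\beta\,\sqrt{\frac{2\pi(1-r^2)}{\alpha}}\; e^{-\frac{\gamma - \alpha\beta^2}{2(1-r^2)}} \cdot 2\int_0^{(x_2-\beta)\sqrt{\alpha/(1-r^2)}} \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 t^2}dt,
\]
or, by (33),
\[
I(\lambda) = z_0\,\beta\,\sqrt{\frac{2\pi(1-r^2)}{\alpha}}\; e^{-\frac{\gamma - \alpha\beta^2}{2(1-r^2)}} \cdot 2\int_0^{\sqrt{(\lambda^2 - \gamma + \alpha\beta^2)/(1-r^2)}} \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 t^2}dt \tag{34},
\]
where $z_0$ is to be determined so as to make the area under the frequency curve of v
equal to unity.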


Application to Anthropometric Data. 


We return to equation (17). If $\bar x$ be large compared with $\sigma_x$, h will be large:
accordingly, $\int_h^\infty \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 X^2} dX$ will be negligible and so, a fortiori, will
\[
\int_h^\infty\!\int_k^\infty \frac{1}{2\pi\sqrt{1-\rho^2}}\, e^{-\frac{1}{2(1-\rho^2)}(X^2 - 2\rho XY + Y^2)}\, dX\, dY
\]
and
\[
\int_{-\infty}^{-h}\!\int_{-k}^\infty \frac{1}{2\pi\sqrt{1-\rho^2}}\, e^{-\frac{1}{2(1-\rho^2)}(X^2 - 2\rho XY + Y^2)}\, dX\, dY.
\]
Hence the chance C of an index not less than v will be, approximately,
\[
C = \int_{-k}^\infty \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 Y^2} dY
= \int_{\frac{v\bar x - \bar y}{(\sigma_y^2 - 2rv\sigma_x\sigma_y + v^2\sigma_x^2)^{1/2}}}^\infty \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 Y^2} dY \tag{35}.
\]

Thus we have Geary's result*, that if the ratio $\bar x/\sigma_x$ be large, then, approximately,
\[ \frac{v\bar x - \bar y}{(\sigma_y^2 - 2rv\sigma_x\sigma_y + v^2\sigma_x^2)^{\frac12}} \]
is distributed normally with unit standard deviation.

Differentiating (35), we see that the equation to the index-distribution will be,
approximately,
\[
\psi(v) = \frac{\sigma_y(\bar x\sigma_y - r\bar y\sigma_x) + v\sigma_x(\bar y\sigma_x - r\bar x\sigma_y)}{(\sigma_y^2 - 2rv\sigma_x\sigma_y + v^2\sigma_x^2)^{\frac32}\sqrt{2\pi}}\;
\exp\left[-\frac12\,\frac{(v\bar x - \bar y)^2}{\sigma_y^2 - 2rv\sigma_x\sigma_y + v^2\sigma_x^2}\right] \tag{36},
\]
a form that can be deduced from (24) by neglecting the first term in the second
member, and replacing the upper limit of the integral factor in the second term by
−∞.
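The quality of the approximation (36) can be examined directly by integrating the joint distribution (26) over x. The Python sketch below is an illustration only: the function names are invented here, Simpson's rule stands in for the analytical integration, and the constants are those later quoted in (40), with r taken as .173833. It compares the numerically integrated ordinate with (36), and also confirms that the ordinate is unchanged when $\bar x$, $\bar y$, $\sigma_x$, $\sigma_y$ are all doubled.

```python
import math

def joint_density(x, y, xb, yb, sx, sy, r):
    """Ordinate of the normal surface (6)."""
    q = ((x - xb) / sx) ** 2 - 2.0 * r * (x - xb) * (y - yb) / (sx * sy) + ((y - yb) / sy) ** 2
    return math.exp(-q / (2.0 * (1.0 - r * r))) / (2.0 * math.pi * sx * sy * math.sqrt(1.0 - r * r))

def index_density_numeric(v, xb, yb, sx, sy, r, n=4000, width=12.0):
    """Integral of |x| f(x, v x) over x by Simpson's rule on xb +/- width*sx.

    The finite range is an assumption that suits constants with xb/sx large.
    """
    lo, hi = xb - width * sx, xb + width * sx
    step = (hi - lo) / n

    def g(x):
        return abs(x) * joint_density(x, v * x, xb, yb, sx, sy, r)

    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(lo + i * step)
    return total * step / 3.0

def index_density_geary(v, xb, yb, sx, sy, r):
    """Approximate ordinate, equation (36)."""
    d = sy * sy - 2.0 * r * v * sx * sy + v * v * sx * sx
    num = sy * (xb * sy - r * yb * sx) + v * sx * (yb * sx - r * xb * sy)
    return num / (d ** 1.5 * math.sqrt(2.0 * math.pi)) * math.exp(-((v * xb - yb) ** 2) / (2.0 * d))

exact = index_density_numeric(0.775, 111.207433, 86.019060, 5.78857, 3.8453, 0.173833)
approx = index_density_geary(0.775, 111.207433, 86.019060, 5.78857, 3.8453, 0.173833)
doubled = index_density_numeric(0.775, 222.414866, 172.038120, 11.57714, 7.6906, 0.173833)
print(exact, approx, doubled)
```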


* R. C. Geary: "The Frequency Distribution of the Quotient of Two Normal Variates," Journal of
the Royal Statistical Society, Vol. XCIII. 1930, p. 442.








































It is worth emphasising the conditions under which this approximation is valid.
It is easy to see that for some values of v the ordinate calculated from (36) will be
negative; but if
\[ \int_{\bar x/\sigma_x}^\infty \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 x^2} dx \]
vanishes to r decimal places, then in numerical
work proceeding to r decimal places or less, equations (17) and (35) will appear to
supply exactly the same values for the frequencies; in other words, the negative
frequencies furnished by (35) will be zero, to the degree of accuracy of our calculations.

Now let us return to equation (34); it shows that the effect of neglecting the
values of x, y that lie outside the ellipse (28) is to change the distribution of
\[ u = \frac{v\bar x - \bar y}{\sqrt{\sigma_y^2 - 2rv\sigma_x\sigma_y + v^2\sigma_x^2}} \]
from a form that is sensibly normal to the form
\[
z = z_0\, e^{-\frac12 u^2} \int_0^{\sqrt{\lambda^2/(1-r^2) - u^2}} \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 t^2} dt.
\]

The ordinate of the normal distribution is thus multiplied by a factor that
decreases with the length of the ordinate; the effect on the appearance of the
curve of distribution of u, and therefore on that of the curve of distribution of v,
will be to increase the areas near the mode at the expense of the tails. But this
effect may easily be invisible, if λ is at all large; if, for example, λ² be eighteen
times 1 − r², no ordinate of the normal curve within a range ±3σ (σ = 1) of the
mean will be altered by much more than one-tenth per cent.

Thus we have the somewhat startling result, that if $\bar x$ and $\bar y$ be positive, and
large compared with their standard deviations, then the limitation of x and y to
finite positive values can change the moments of the index-distribution from infinite 
to finite values, without having any visible effect on the appearance of the distri- 
bution. The paradox disappears, when we consider the difference between mathe- 
matical formulae and their numerical representation. Infinite moments cannot 
occur in the numerical applications of mathematical theory, any more than they 
can occur in experimental sampling. Any calculation of frequencies from the 
mathematical equation to a distribution will be performed to a certain number of 
decimal places; as calculated from the numerical frequencies, the moments will 
always be finite. When we say that the distribution has infinite moments, we 
mean that the numerical values of the moments do not tend to any finite limits, 
as we increase indefinitely the number of places to which we work ; but this remark 
has no practical interest—it is, in fact, irrelevant. 

What the computer about to fit a curve wants to know, are the values that the
moments would have, for the set of numerical frequencies calculated to the number
of decimal places that he intends working to. In any discussion of anthropological
data*, this number will be small enough to justify the use of (35) in place of (17),
and the frequencies will appear to be those under (36), or, if λ be taken large
enough, under (34). But since (34) is a distribution of limited range, the numerical
values of its moments, as calculated from the numerical frequencies, will not differ
appreciably from the values obtained by a direct mathematical process. This,
I think, rather than any a priori rejection, as impossible, of zero or infinite values
of the variates, is the true justification of Merrill's method of approaching the
subject*.

[* This appears to exclude from anthropological data such a character as corneal astigmatism, where
the mean = .62 dioptres, and the standard deviation .86 dioptres; thus the character may be negative as
well as positive. An index formed by the ratio of corneal astigmatism to distance of near point would
need (24) rather than (36). Ed.]


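If ξ and η be the deviations of x and y from their mean values, so that
\[ x = \bar x + \xi, \qquad y = \bar y + \eta, \]
Merrill writes
\[
v = \frac{\bar y + \eta}{\bar x + \xi} = \frac{\bar y}{\bar x}\left(1 + \frac{\eta}{\bar y}\right)\left(1 - \frac{\xi}{\bar x} + \frac{\xi^2}{\bar x^2} - \frac{\xi^3}{\bar x^3} + \ldots\right) \tag{38},
\]
and takes for $\mu_n'$, the nth moment about zero of the distribution of v, the mean
value of
\[
v^n = \left(\frac{\bar y}{\bar x}\right)^n \left(1 + \frac{\eta}{\bar y}\right)^n \left(1 - \frac{\xi}{\bar x} + \frac{\xi^2}{\bar x^2} - \frac{\xi^3}{\bar x^3} + \ldots\right)^n \tag{39};
\]
to obtain this value Merrill retains the products $\xi^r\eta^s$ as far as the eighth order and
takes for their mean values the product moments $p_{rs}$ of the normal surface (6).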


This process would not be valid if we imagined it applied to the whole of the 
xy-plane, since the expansion (38) holds only if $|\xi| < \bar x$; but it is valid, if applied
to the interior of any probability contour (28) that lies in the positive quadrant. 
In the case of low-order product-moments, we do not commit any serious error in 
taking for the product-moments of the curtailed distribution (27) the values derived 
from the whole normal surface (6), so that Merrill’s values of the moments may 
legitimately be regarded as the moments of the index-distribution in a curtailed 
normal population, which are exactly what we want. 


Illustration. 

I have illustrated the preceding theory on some figures kindly supplied by
Dr T. L. Woo from his measurements on the Biometric Laboratory's series of Egyptian
skulls. Table I shows the joint distribution of $T_1(L)$ and $P_2(L)$†, two measurements
made, on the temporal and parietal bones respectively, in the left-hand side
of 787 skulls.

Taking $P_2(L)$ as x, $T_1(L)$ as y, we have‡
\[
\bar x = 111.207433, \quad \bar y = 86.019060, \quad \sigma_x = 5.78857, \quad \sigma_y = 3.8453, \quad r = .173833 \tag{40}.
\]


* A. S. Merrill, "Frequency Distribution of an Index when Both the Components Follow the Normal
Law," Biometrika, Vol. XXA. 1928, pp. 53–63.

† For the precise definition of these measurements see T. L. Woo, "On the Asymmetry of the
Human Skull," Biometrika, Vol. XXII. 1930–31, pp. 326 and 327.

‡ These constants, and those of the sample distribution of the index, were kindly calculated by
Dr T. L. Woo.
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f T;(L) and P2(L). 
TABLE II. The Index Distribution.

(Entries shown as … are not recoverable from the print; — denotes no observation.)

        Frequencies                      Ordinates
Index   Observed   Calculated     v      From (41)   From (42)
 .60       —           …         .595         1           1
 .61       —           …         .605         3           3
 .62       —           …         .615         9           9
 .63       —           …         .625        22          22
 .64       —           …         .635        50          50
 .65       1           …         .645       106         106
 .66       6           …         .655       211         211
 .67       6           …         .665       389         389
 .68       6           …         .675       672         671
 .69      12           …         .685      1086        1086
 .70      16           …         .695      1650        1649
 .71      31           …         .705      2359        2359
 .72      30           …         .715      3184        3184
 .73      49           …         .725      4064        4065
 .74      47           …         .735      4920        4921
 .75      69           …         .745      5660        5661
 .76      77           …         .755      6200        6202
 .77      63           …         .765      6483        6484
 .78      56           …         .775      6483        6483
 .79      60           …         .785      6212        6211
 .80      67           …         .795      5715        5714
 .81      46           …         .805      5057        5056
 .82       …           …         .815      4313        4312
 .83       …           …         .825      3550        3550
 .84       …           …         .835      2826        2826
 .85      13           …         .845      2179        2179
 .86      13           …         .855      1630        1630
 .87       …           …         .865      1185        1185
 .88       4           …         .875       838         838
 .89       2           …         .885       578         577
 .90       4           …         .895       388         388
 .91       3           …         .905       255         255
 .92       4           …         .915       164         164
 .93       1           …         .925       103         103
 .94       2           …         .935        64          64
 .95       2           …         .945        38          38
 .96       —           …         .955        23          23
 .97       —           …         .965        13          13
 .98       —           …         .975         8           8
 .99       —           …         .985         4           4
1.00       —           …         .995         2           2
1.01       —           …        1.005         1           1
The distribution of the index $v = T_1(L)/P_2(L)$ in the 787 skulls is shown in
the first column of Table II.

Let us assume that $P_2(L)$ and $T_1(L)$ are distributed in a normal surface whose
constants are given by (40)*. We have $\bar x/\sigma_x = 19.22$, so that
$\int_{\bar x/\sigma_x}^\infty \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 x^2} dx$
vanishes to something like 80 places of decimals. The frequencies of the theoretical
index-distribution (24) may therefore be calculated from (35); they are given in
the second column of Table II. (Their calculation can be effected very rapidly, by
first forming a column of the values of
$(v\bar x - \bar y)/(\sigma_y^2 - 2rv\sigma_x\sigma_y + v^2\sigma_x^2)^{\frac12}$
corresponding to the
boundaries of the frequency groups, and then interpolating into tables of the normal
curve.) Combining the tails of the theoretical distribution, above .895 and below
.675, into two single classes, and grouping together the frequencies whose centres
are .88 and .89, we find χ² = 25.1504 and P = .290, so that it is quite likely that
the sample shown in the first column is a random sample from the parent population
shown in the second column in Table II.

* Actually we have for $P_2(L)$: β₁ = .0002, β₂ = 3.5340,
for $T_1(L)$: β₁ = .0064, β₂ = 3.1576.
From E. S. Pearson's table of the 5% and 1% points of the distribution of β₁ and β₂ (Biometrika,
Vol. XXII. 1930–31, p. 248; or Tables for Statisticians and Biometricians, Part II, Table XXXVII bis),
In Table III are shown the constants of the index-distribution, calculated

(i) from the observed distribution,

(ii) from the numerical frequencies given for the theoretical distribution,

(iii) from Merrill's formulae*.

TABLE III. Frequency Constants of the Index-Distribution.

                                    Calculated from
Frequency    (i) Col. 1,     (ii) Col. 2,     (iii) Merrill's
Constant     Table II        Table II         formulae
Mean          .77469          .775296          .775298
S. D.         .048280         .048646          .048664
β₁            .1111           .0506            .0511
β₂           3.6296          3.1073           3.1196

The agreement between the last two columns is quite satisfactory.
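The mean in column (iii) can be reproduced from the expansion (38). The Python sketch below is an illustration under stated assumptions, not Merrill's published formulae: the function names are invented here, the constants are those of (40) with r taken as .173833, and the product moments of the normal surface are written as $E[\xi^j] = (j-1)!!\,\sigma_x^j$ for even j and $E[\eta\,\xi^j] = j!!\; r\sigma_x\sigma_y\,\sigma_x^{j-1}$ for odd j, retaining products up to the eighth order.

```python
def double_factorial(n):
    """n!! = n (n-2) (n-4) ..., with 0!! = 1!! = 1."""
    out = 1
    while n > 1:
        out *= n
        n -= 2
    return out

def merrill_mean(xb, yb, sx, sy, r, max_order=8):
    """Mean of (yb/xb)(1 + eta/yb)(1 - xi/xb + xi^2/xb^2 - ...), term by term."""
    c = sx / xb                      # coefficient of variation of x
    cov = r * sx * sy / (xb * yb)    # E[xi*eta] / (xb*yb)
    total = 1.0
    for j in range(1, max_order + 1):
        if j % 2 == 0:
            # mean of (xi/xb)^j, which enters with a positive sign
            total += double_factorial(j - 1) * c ** j
        elif j + 1 <= max_order:
            # mean of (eta/yb)(-xi/xb)^j, a product of order j + 1
            total -= cov * double_factorial(j) * c ** (j - 1)
    return (yb / xb) * total

mean_v = merrill_mean(111.207433, 86.019060, 5.78857, 3.8453, 0.173833)
print(round(mean_v, 6))
```

Under these assumptions the result should agree with the tabulated .775298 to about the last figure printed.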

If we substitute in equation (24) the values of $\bar x$, $\bar y$, $\sigma_x$, $\sigma_y$, and r given by (40),
we find that throughout the effective range of the distribution of v, the first term 
in ¥(v) is less than e~*°, while the upper limit of the integral factor in the second 
term is in the neighbourhood of — 27. These figures indicate the extreme accuracy 
with which the index-distribution is represented by (36). Substituting in that 
equation, and multiplying the second member by 787, the size of the sample, we 
find for the equation to the index-distribution 

2s 86°01906 *20743 v)* 
1032191 + 192966v |. see 
= = e 2 14°7867 —7°7386 0 + 33-5067 v- 
(14-7867 — 7'7386v + 33 5067 v7)? V 2a 
the Pearson Type IV curve, fitted from Merrill’s formulae for the moments, is 
_9 7.29F1Q ~ O2OF : 73-2815 31-34936 tan-!' 2-31722 2 ¢ 
y= 10~™ x 7-83548 (1 + 5:36951 2%) ~ 32815! elStsi036 tan” 2SITa82 (49), 
we find that while the values of β₁ are not significant of any departure from normality, β₂ exceeds 3.30
in less than 5% and 3.48 in less than 1% of random samples of 787 from a normal population. It is
accordingly very unlikely that $P_2(L)$ is distributed normally, but it will be seen that the departure from
normality does not seriously affect the goodness of fit of the index-distribution.
* Loc. cit. pp. 53, 56, 57.
The ordinates of these two curves are shown in the last two columns of Table II;
the agreement between them is practically exact. The present example would
appear to indicate, therefore, that there will in practice be little difference between
the conclusions suggested by the two theories. Nevertheless, once it has been
assumed that the components of the index are normally distributed, it seems
definitely preferable to use the index-distribution (24), and its probability integral
(17), implied in this assumption of normality. We thereby retain a consistent
mathematical theory, by avoiding the extraneous assumption that the index-distribution
can be represented by one of the Pearsonian curves; moreover,
calculating the ordinates and frequencies for Geary's approximation to (24) is
considerably less laborious than doing so for a Type IV curve.
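The ordinates of the first curve are easily recomputed. The Python sketch below is an illustration under stated assumptions: it takes from the printed equation (41) only the legible coefficients 14.7867, 7.7386 and 33.5067 of the quadratic in v, reads them as $\sigma_y^2$, $2r\sigma_x\sigma_y$ and $\sigma_x^2$, and rebuilds the numerator from the general form (36) multiplied by the sample size 787. The function name is invented for the example.

```python
import math

X_BAR, Y_BAR = 111.207433, 86.019060
# Coefficients of the quadratic in v as printed in (41); read here as
# sigma_y^2, 2*r*sigma_x*sigma_y and sigma_x^2 (an assumption of this sketch).
A, B, C = 14.7867, 7.7386, 33.5067
SAMPLE_SIZE = 787

def ordinate(v):
    """Ordinate of the fitted index curve, form (36) scaled by the sample size."""
    d = A - B * v + C * v * v
    half_b = B / 2.0                                   # r*sigma_x*sigma_y
    numerator = (X_BAR * A - Y_BAR * half_b) + v * (Y_BAR * C - X_BAR * half_b)
    gaussian = math.exp(-0.5 * (Y_BAR - X_BAR * v) ** 2 / d)
    return SAMPLE_SIZE * numerator / (d ** 1.5 * math.sqrt(2.0 * math.pi)) * gaussian

for v in (0.775, 0.785):
    print(v, round(ordinate(v)))
```

The values at v = .775 and v = .785 can be set against the entries 6483 and 6212 of Table II; division by 100 gives the approximate frequency in a class of width .01.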


It will be observed that while we get a fairly satisfactory fit of the theoretical
to the observed index-distribution, the latter is more leptokurtic than the former.
It is therefore to be expected that the fit will be improved, if we assume that $T_1(L)$
and $P_2(L)$ are distributed in a curtailed normal surface such as (27), so that the
distribution of the index becomes of the type (34). Actually, however, we find that
for the outlying individual for which $T_1(L) = 92$, $P_2(L) = 134$,
\[
\left(\frac{x-\bar x}{\sigma_x}\right)^2 - 2r\,\frac{(x-\bar x)(y-\bar y)}{\sigma_x\sigma_y} + \left(\frac{y-\bar y}{\sigma_y}\right)^2 = 15.7947,
\]
so that unless we are to reject this individual as a pathological anomaly, we must
take the λ² of the limiting contour (28) at least as large as 16. The following table
indicates the sort of value that F, the integral factor in I(λ) (equation 34), assumes,
if we adopt this value for λ²:

 v:    .625    .635    .645    .655    .695     .735      .775
2F:    .961    .986    .994    .997    .9998    .99993    .99995
 v:    .955    .945    .935    .925    .885     .845      .805
2F:    .980    .989    .994    .996    .9994    .99986    .99994

These values make it clear that there will not be any significant difference between
the ordinates deduced from (34) and those shown in Table II; in other words, there
will be no appreciable improvement in the fit.


I have to thank Professor Pearson for his advice at several points in the course 
of this paper, and for lending me the manuscript of an unpublished lecture by him 
on Dr Merrill’s work *. 


* [The purpose of the lecture referred to lay in pointing out from a number of examples, whose
components were even more nearly normal than those of Woo's case, that the β₂'s of index-distributions
were very considerably in excess of their theoretical value as deduced from Merrill's process.
Mr Fieller shows that Merrill's method and his own lead to results in fair accordance both in the
distribution of frequency and in the values of the constants. Assuming the theoretical values of both methods
to give β₁ = .10 and β₂ = 3.1, the standard error of β₂ for a sample size 787 is .2395. Thus the observed
β₂ has a deviation from the theoretical value of 2.18 times its standard error, and roughly this would
indicate a probability of less than .02 of the observed β₂ being due to a random sampling from the
theoretical population. Thus the difficulty discussed in the lecture is emphasised rather than
surmounted by a method which gives constants in accordance with Merrill's results. Ed.]





























A NOTE ON THE DISTRIBUTION OF THE 
CORRELATION RATIO. 


By JOHN WISHART, M.A., D.Sc., Clare College, Cambridge. 


Introduction. 


THE sampling distribution of the correlation ratio may now be said to have
been determined for three different cases, in all of which the arrays of a variable, y,
are normally distributed with a common variance in the population sampled.
Suppose that we reserve the symbol η for the population value and denote by E²
the square of the correlation ratio calculated from a sample. That is
\[
E^2 = \sum_{1}^{p}\{n_i(\bar y_i - \bar y)^2\} \Big/ \sum\{(y - \bar y)^2\} \tag{1}.
\]
The number of arrays is p, the ith array having $n_i$ observations; $\bar y_i$ is the mean of
the observations in the ith array, and $\bar y$ is the general mean. The first summation
is over all arrays, while the second is for all the observations of the sample,
i.e. from 1 to N, where $N = \sum_{1}^{p}(n_i)$.
Then it is known that if η² be zero*, the distribution of E² is
\[
df = \frac{\{\tfrac12(n_1 + n_2 - 2)\}!}{\{\tfrac12(n_1 - 2)\}!\,\{\tfrac12(n_2 - 2)\}!}\,(E^2)^{\frac12(n_1 - 2)}(1 - E^2)^{\frac12(n_2 - 2)}\, d(E^2) \tag{2},
\]
a form which shows the identity of the form with a general class of distributions
having symmetry in $n_1$ and $n_2$, interpreted in this case as the numbers of degrees
of freedom between and within arrays, so that
\[ n_1 = p - 1, \qquad n_2 = N - p. \]
If η² be not zero, two separate cases arise, for both of which solutions can be
derived from the two distributions of the multiple correlation coefficient given by
R. A. Fisher†.

* R. A. Fisher: Journ. Roy. Stat. Soc. Vol. LXXXV. 1922, p. 605. The distribution was also deduced
at a later date by Hotelling: Proc. Nat. Acad. Sc. Vol. XI. 1925, pp. 657–662.
† R. A. Fisher: Proc. Roy. Soc. A, 121, 1928, pp. 654–673.

Case (a). Here it is to be supposed that the conditions of sampling are such
that the array totals, $n_i$, vary from sample to sample. The sampling distribution
is then given by Fisher's series (A), writing
\[ R = E, \quad \rho = \eta, \quad n_1 = p - 1, \quad n_2 = N - p, \]
provided that the expectations of y for the values of x in the sampled population
are normally distributed. This distribution has been studied at some length by
Fisher in the paper cited, and in particular the probability integral for $n_2$ even
was expressed in finite terms. As it was thought that the mean and second moment
coefficient of this distribution were of some interest in themselves, these quantities
were later determined for $n_2$ even, and the results inferred to hold generally*.
Such results can readily be translated into terms of the correlation ratio.


Case (b). Of more practical interest, however, is the case where the number $n_i$
in each array is supposed the same for all samples. The distribution of E² is then
that of R² in Fisher's distribution (C), writing
\[ R = E, \quad \beta^2 = \frac{N\eta^2}{1 - \eta^2}, \quad n_1 = p - 1, \quad n_2 = N - p. \]

Fisher did not study the properties of distribution (C), and the object of the
present paper is to obtain the probability integral in finite terms, and in addition
expressions for the mean and second moment coefficients, developing the
distribution as that of E², the square of the correlation ratio, although the results can
readily be translated in terms of any variate following the same law of distribution.

Further, both Fisher's (A) and (C) distributions were shown to tend in the
limit as the size of the sample was increased to a third distribution (B), and it
will be shown how the results derived in this paper, equally with those previously
obtained from the (A) distribution, tend in the limit to the corresponding
parameters of the distribution (B). For this limiting form we have
\[ \beta^2 = n_2\eta^2, \qquad B^2 = n_2 E^2. \]
The (C) Distribution. 


Changing the notation as explained above, and denoting $\tfrac12 n_1$ by a, $\tfrac12 n_2$ by b,
E² by x and $\tfrac12\beta^2$ by t to simplify the mathematics, the distribution of x takes the
form
\[
df = \frac{(a+b-1)!}{(a-1)!\,(b-1)!}\, e^{-t}\, x^{a-1}(1-x)^{b-1}\, dx
\times \left[1 + \frac{a+b}{1!\,a}(tx) + \frac{(a+b)(a+b+1)}{2!\,a(a+1)}(tx)^2 + \ldots\right] \tag{3}.
\]
Since $n_1$ and $n_2$ are necessarily whole numbers, a and b may be integers or half
integers, but the factorial sign is used in either case, i.e. x! denotes what is
generally understood by Γ(x + 1). The series within square brackets is a confluent
hypergeometric one, and may be denoted by F(a + b, a, tx). Now by an application
of Kummer's formula†, we find
\[ F(a+b, a, tx) = e^{tx}\, F(-b, a, -tx), \]
giving, when b is an integer, a terminating series of the form
\[ 1 + \frac{b}{1!\,a}\,tx + \frac{b(b-1)}{2!\,a(a+1)}\,(tx)^2 + \ldots. \]

* J. Wishart: Biometrika, Vol. XXIII. 1931, pp. 353–361.
† Whittaker and Watson: Modern Analysis, § 16.11 (2nd edn. p. 332).
At this point we shall suppose b to be an integer, returning later to a consideration
of the other case, when b is a half-integer. Noting that
\[
\frac{d^b}{dx^b}\left(x^{a+b-1}e^{tx}\right) = \frac{(a+b-1)!}{(a-1)!}\, x^{a-1} e^{tx}\, F(-b, a, -tx),
\]
the distribution (3) may be written in the comparatively simple form
\[
df = \frac{e^{-t}}{(b-1)!}\,(1-x)^{b-1} f^{(b)}(x)\, dx \tag{4},
\]
where $f(x) = x^{a+b-1}e^{tx}$, and $f^{(b)}(x)$ denotes the bth differential coefficient of f(x)
with respect to x.
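The two forms of the distribution can be compared numerically. The Python sketch below is an illustration only: the function names and the values a = 2, b = 3, t = 1.7 are assumptions made for the example. It evaluates the ordinate of (3) both from the infinite series F(a + b, a, tx) and from the terminating series given by Kummer's formula, and checks by Simpson's rule that the area from 0 to 1 is unity.

```python
import math

def density_terminating(x, a, b, t):
    """Ordinate of (3) for integer b, using e^{tx} F(-b, a, -tx)."""
    coef = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    series, term = 0.0, 1.0
    for n in range(b + 1):
        series += term
        # ratio of successive terms of F(-b, a, -tx)
        term *= (b - n) * t * x / ((a + n) * (n + 1))
    return coef * math.exp(-t * (1.0 - x)) * x ** (a - 1) * (1.0 - x) ** (b - 1) * series

def density_series(x, a, b, t, terms=200):
    """Ordinate of (3) from the confluent series F(a+b, a, tx) summed directly."""
    coef = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    series, term = 0.0, 1.0
    for n in range(terms):
        series += term
        term *= (a + b + n) * t * x / ((a + n) * (n + 1))
    return coef * math.exp(-t) * x ** (a - 1) * (1.0 - x) ** (b - 1) * series

def simpson(func, lo=0.0, hi=1.0, n=2000):
    """Composite Simpson's rule with an even number n of intervals."""
    step = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(lo + i * step)
    return total * step / 3.0

area = simpson(lambda x: density_terminating(x, 2, 3, 1.7))
gap = density_terminating(0.4, 2, 3, 1.7) - density_series(0.4, 2, 3, 1.7)
print(area, gap)
```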


Probability Integral. 

In this form it is easy to evaluate the indefinite or probability integral of the
distribution. The range in (4) is from 0 to 1, and since $f^{(r)}(0) = 0$ for all values of
r from 0 to b − 1, the integral of (4) from 0 to x may be written down directly.
In fact,
\[
\int_0^x df = e^{-t}\sum_{r=0}^{b-1}\frac{(1-x)^r}{r!}\, f^{(r)}(x)
\]
\[
= (a+b-1)!\; e^{-t(1-x)} \sum_{r=0}^{b-1}\left\{\frac{x^{a+b-1-r}(1-x)^r}{r!\,(a+b-1-r)!}\, F(-r,\, a+b-r,\, -tx)\right\} \tag{5},
\]
involving a series which terminates in $\tfrac12 b(b+1)$ elementary terms.

Now by Taylor's theorem
\[ f(x+h) = \sum_{r=0}^{\infty}\frac{h^r}{r!}\, f^{(r)}(x). \]
Put h = 1 − x, and we see that (5) involves the first b terms of an expansion in
Taylor's series, of which the complete series is
\[ f(x + 1 - x) = f(1) = e^{t}. \]
When, therefore, we are interested in the "tail" of the distribution curve, as when
we wish to find a value of x for which the proportionate area under the curve
beyond the ordinate at x is 0.02 or 0.01, say, we may write the probability integral
in the form
\[
\int_0^x df = 1 - e^{-t}\sum_{r=b}^{\infty}\frac{(1-x)^r}{r!}\, f^{(r)}(x) \tag{6}.
\]
If it were desired to extend Woo's table* for values of η² other than zero, it
would be necessary to solve equations for x of the form
\[
\sum_{r=b}^{\infty}\frac{(1-x)^r}{r!}\, g^{(r)}(x) = 0.02 \text{ and } 0.01,
\]
where
\[
g^{(r)}(x) = e^{-t} f^{(r)}(x) = \frac{d^r}{dx^r}\left(x^{a+b-1}e^{-t(1-x)}\right).
\]
For given size of sample and number of arrays, and for a given value of η², this
would give an x (or E²) beyond which there is only a 1 in 50 or a 1 in 100 chance
of a value occurring in samples from a population with this value of η². At best,
however, this computation will be a long and laborious business, for it does not
appear that any essential simplification is possible in the expression of our results
(5) and (6).

* T. L. Woo: Biometrika, Vol. XXI. 1929, pp. 1–66. Tables for Statisticians and Biometricians, Part II,
1931, pp. 16–72. A table of the author's, in Quart. Journ. Roy. Met. Soc. Vol. LIV, 1928, pp. 258–259,
gives the 0.05 and 0.01 levels of significance, and extends to 7 arrays and a size of sample of about 100.
It covers a range below 50, which is not covered by Woo's table.

For the special case of samples from uncorrelated data, Woo's tables provide,
in addition to the approximate 0.01 and 0.02 probability levels of significance, the
mean value and standard deviation of the square of the sample correlation ratio
(our x or E²). We shall proceed now to determine these quantities in the general
case, still on the assumption that b is an integer.


Mean value of $E^2$.

Beginning with the form (4) of the distribution, let us multiply by $x$ and integrate from 0 to 1. We have
$$\overline{E^2} = \int_0^1 \frac{e^{-t}}{(b-1)!}\, x(1-x)^{b-1}\, d\{f^{(b-1)}(x)\}$$
$$= \left[\frac{e^{-t}}{(b-1)!}\, x(1-x)^{b-1} f^{(b-1)}(x)\right]_0^1 - \int_0^1 \frac{e^{-t}}{(b-1)!}\left\{(1-x)^{b-1} - (b-1)\,x(1-x)^{b-2}\right\} f^{(b-1)}(x)\,dx,$$
on integrating by parts,
$$= -\int_0^1 \frac{e^{-t}}{(b-1)!}\left\{(1-x)^{b-1} - (b-1)\,x(1-x)^{b-2}\right\} f^{(b-1)}(x)\,dx,$$
since the term between limits vanishes. Continuing this process we have finally
$$\overline{E^2} = \int_0^1 \frac{e^{-t}}{(b-1)!}\left\{-(b-1)(b-1)!\,(1-x) + (b-1)!\,x\right\} f'(x)\,dx$$
$$= e^{-t}\int_0^1 \{1 - b(1-x)\}\, d\left(x^{a+b-1}e^{tx}\right)$$
$$= 1 - b\,e^{-t}\int_0^1 x^{a+b-1}e^{tx}\,dx \qquad (7)$$
on further integration by parts. An examination of the integral in (7) shows that when $t = 0$ we have
$$\overline{E^2} = 1 - \frac{b}{a+b} = \frac{a}{a+b} = \frac{n_1}{n_1+n_2} = \frac{p-1}{N-1},$$
agreeing with the result deduced directly from the special form (2) of the distribution when $\eta^2 = 0$. On the other hand when $t$ becomes very large the important part of the integrand is $e^{tx}$, whose integral from 0 to 1 is $(e^{t}-1)/t$, so that the second member of (7) behaves like $b(1-e^{-t})/t$, which tends to zero as $t$ tends to infinity. Thus when $\eta^2 = 1$, we have $\overline{E^2} = 1$.


The integral in (7) may be evaluated in series form in two ways, according to whether $t$ is less or greater than $a+b$. In the first case, expanding the exponential and integrating term by term, we have
$$\int_0^1 x^{a+b-1}e^{tx}\,dx = \left[\frac{x^{a+b}}{a+b} + t\,\frac{x^{a+b+1}}{a+b+1} + \frac{t^2}{2!}\,\frac{x^{a+b+2}}{a+b+2} + \dots\right]_0^1$$
$$= \frac{1}{a+b}\,F(a+b,\ a+b+1,\ t)$$
$$= \frac{e^{t}}{a+b}\,F(1,\ a+b+1,\ -t)$$
by Kummer's formula, $F$ being the confluent hypergeometric series as already defined. Thus, in general,
$$\overline{E^2} = 1 - \frac{b}{a+b}\,F(1,\ a+b+1,\ -t) \qquad (8).$$
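As a numerical illustration of (7) and (8), the following sketch sums the confluent hypergeometric series directly and compares it with a Simpson's-rule evaluation of the integral in (7). The function names, the number of series terms and the quadrature step are assumptions made for this illustration and are not part of the paper; the direct series is only suitable for moderate $t$, as the text itself remarks.

```python
import math

def kummer_series(alpha, gamma, z, terms=400):
    """Sum the confluent hypergeometric series F(alpha, gamma, z) term by term."""
    total, term = 0.0, 1.0
    for r in range(terms):
        total += term
        # ratio of consecutive terms: (alpha+r)/(gamma+r) * z/(r+1)
        term *= (alpha + r) / (gamma + r) * z / (r + 1)
    return total

def mean_E2_series(a, b, t):
    """Equation (8): 1 - b/(a+b) * F(1, a+b+1, -t)."""
    return 1.0 - b / (a + b) * kummer_series(1.0, a + b + 1.0, -t)

def mean_E2_quadrature(a, b, t, n=20000):
    """Equation (7): 1 - b e^{-t} int_0^1 x^(a+b-1) e^(tx) dx, by Simpson's rule.

    Assumes a + b >= 1 so that the integrand is finite at x = 0; n must be even.
    """
    c = a + b
    g = lambda x: x ** (c - 1.0) * math.exp(t * (x - 1.0))  # e^{-t} folded in
    h = 1.0 / n
    s = g(0.0) + g(1.0)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(i * h)
    return 1.0 - b * s * h / 3.0

if __name__ == "__main__":
    print(mean_E2_series(2, 3, 1.5), mean_E2_quadrature(2, 3, 1.5))
```

With $t = 0$ the series reduces to $a/(a+b)$, the uncorrelated value quoted above.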


The series $F$ is of the form
$$1 - \frac{t}{a+b+1} + \frac{t^2}{(a+b+1)(a+b+2)} - \dots$$
and can be readily evaluated when $a+b+1$ is large compared with $t$. When $t$ is large, however, or when $a+b$ is small whatever the value of $t$ be, it will be found best to transform (8), a process which is best carried out by returning to the integral in (7) and integrating by parts. We have:


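$$\int_0^1 x^{a+b-1}e^{tx}\,dx = \frac{e^{t}}{t} - \frac{a+b-1}{t}\int_0^1 x^{a+b-2}e^{tx}\,dx$$
$$= \frac{e^{t}}{t}\left[1 - \frac{a+b-1}{t} + \frac{(a+b-1)(a+b-2)}{t^2} - \dots + \frac{(-1)^{a+b-1}(a+b-1)!}{t^{a+b-1}}\left(1 - e^{-t}\right)\right]$$
when $(a+b)$ is an integer. When $(a+b)$ is a half-integer, the other possibility, the corresponding form is
$$\frac{e^{t}}{t}\left[1 - \frac{a+b-1}{t} + \frac{(a+b-1)(a+b-2)}{t^2} - \dots + \frac{(-1)^{a+b-\frac32}(a+b-1)\dots\frac32}{t^{a+b-\frac32}}\right] + \frac{(-1)^{a+b-\frac12}(a+b-1)\dots\frac12}{t^{a+b-\frac12}}\int_0^1 x^{-\frac12}e^{tx}\,dx.$$
Putting $x = u^2$ in the final integral, the series becomes
$$\frac{e^{t}}{t}\left[1 - \frac{a+b-1}{t} + \frac{(a+b-1)(a+b-2)}{t^2} - \dots + \frac{(-1)^{a+b-\frac32}(a+b-1)\dots\frac32}{t^{a+b-\frac32}}\left(1 - e^{-t}\int_0^1 e^{tu^2}du\right)\right].$$
We therefore have
$$\overline{E^2} = 1 - \frac{b}{t}\left[1 - \frac{a+b-1}{t} + \dots + \frac{(-1)^{a+b-1}(a+b-1)!}{t^{a+b-1}}\left(1 - e^{-t}\right)\right] \quad (a+b) \text{ an integer} \qquad (9),$$
$$\overline{E^2} = 1 - \frac{b}{t}\left[1 - \frac{a+b-1}{t} + \dots + \frac{(-1)^{a+b-\frac32}(a+b-1)\dots\frac32}{t^{a+b-\frac32}}\left(1 - e^{-t}\int_0^1 e^{tu^2}du\right)\right] \quad (a+b) \text{ a half-integer} \qquad (10).$$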
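The finite form (9) can be checked against the direct series (8). The sketch below is only an illustration under the assumption that $a+b$ is a positive integer and $t > 0$; the function names are hypothetical and the series routine is a plain term-by-term sum, not a library call.

```python
import math

def kummer_series(alpha, gamma, z, terms=400):
    """Term-by-term sum of the confluent hypergeometric series F(alpha, gamma, z)."""
    total, term = 0.0, 1.0
    for r in range(terms):
        total += term
        term *= (alpha + r) / (gamma + r) * z / (r + 1)
    return total

def mean_E2_finite(a, b, t):
    """Equation (9): finite expansion in powers of 1/t, (a+b) a positive integer."""
    c = a + b
    if c != int(c) or c < 1 or t <= 0:
        raise ValueError("(9) is used here only for integer a+b >= 1 and t > 0")
    c = int(c)
    bracket = 0.0
    coeff = 1.0  # coeff = (-1)^k (c-1)(c-2)...(c-k) / t^k
    for k in range(c - 1):
        bracket += coeff
        coeff *= -(c - 1 - k) / t
    # remainder term: (-1)^(c-1) (c-1)! / t^(c-1) * (1 - e^{-t})
    bracket += coeff * (1.0 - math.exp(-t))
    return 1.0 - b / t * bracket

def mean_E2_series(a, b, t):
    """Equation (8), for comparison."""
    return 1.0 - b / (a + b) * kummer_series(1.0, a + b + 1.0, -t)

if __name__ == "__main__":
    for a, b, t in [(2, 3, 1.5), (1, 2, 6.0)]:
        print(a, b, t, mean_E2_finite(a, b, t), mean_E2_series(a, b, t))
```

For large $t$ the finite form remains cheap to evaluate, whereas the alternating series (8) suffers from cancellation, which is the practical reason the text gives for the transformation.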
The series in (9) and (10) consist of a finite number of terms, from which the value of $\overline{E^2}$ may be readily computed. When $t$ is large compared with $(a+b)$, it may happen that the terms become negligible before the last, or remainder term is reached. If, however, this last term has to be computed, there is no difficulty with (9), but with (10) we have to consider methods of evaluating the integral. Let $t(1-u^2) = x$. Then
$$e^{-t}\int_0^1 e^{tu^2}du = \frac{1}{2t}\int_0^t \frac{e^{-x}\,dx}{\sqrt{1 - x/t}} \qquad (11).$$


If we now expand $(1 - x/t)^{-\frac12}$ in powers of $x/t$, we get
$$\frac{1}{2t}\left[\int_0^t e^{-x}dx + \frac{1}{2t}\int_0^t x\,e^{-x}dx + \dots + \frac{1\cdot3\cdot5\dots(2r-1)}{r!\,(2t)^r}\int_0^t x^r e^{-x}dx + \dots\right]$$
$$= \frac{1}{2t}\left[\gamma(1,t) + \frac{1}{2t}\,\gamma(2,t) + \dots + \frac{1\cdot3\cdot5\dots(2r-1)}{r!\,(2t)^r}\,\gamma(r+1,t) + \dots\right],$$
where $\gamma(r+1, t)$ is written for the incomplete gamma integral
$$\int_0^t x^r e^{-x}dx.$$
The $(r+1)$th term of the series in square brackets may be written
$$\frac{(r-\frac12)!}{\sqrt{\pi}\; r!\; t^r}\,\gamma(r+1, t),$$
or in terms of the Incomplete Gamma-Function $I(u, p) = \int_0^{u\sqrt{p+1}} v^p e^{-v}dv \big/ p!$ we have
$$e^{-t}\int_0^1 e^{tu^2}du = \frac{1}{2t\sqrt{\pi}}\sum_{r=0}^{\infty}\frac{(r-\frac12)!}{t^r}\; I\!\left(\frac{t}{\sqrt{r+1}},\ r\right),$$
which can be evaluated from the Tables of the Incomplete $\Gamma$-function*.

Since the complete integral $I(\infty, r)$ is equal to unity, we see that the integral in (11) becomes
$$\frac{1}{2t}\left[1 + O(1/t)\right],$$
which tends to zero as $t$ tends to infinity. This can also be inferred directly from (11), for in the second integral the important part of the integrand when $t$ is large is $e^{-x}$, and the integral therefore tends to resemble
$$(1 - e^{-t})/(2t),$$
which tends to zero as $t$ tends to infinity.

The value of (11) in direct powers of $t$ is the uniformly convergent series
$$e^{-t}\sum_{r=0}^{\infty}\frac{t^r}{r!\,(2r+1)} = F(1,\ \tfrac32,\ -t),$$
* H.M. Stationery Office, 1922. 
but as computation would only be feasible from this series with $t$ of the order of unity or lower, it would be better to go back to a direct application of (8), especially when it is remembered that small samples are usually of little value, and values of $(a+b)$ of less than 20 to 25 will be of little practical importance.


Equation (9) is, of course, a direct consequence of the application to the confluent hypergeometric series in (8) of a well-known formula
$$F(\alpha, \gamma, x) = \frac{(\gamma-1)!}{(\gamma-\alpha-1)!}\,(-x)^{-\alpha}\left\{1 - \frac{\alpha(\alpha-\gamma+1)}{1!\,x} + \frac{\alpha(\alpha+1)(\alpha-\gamma+1)(\alpha-\gamma+2)}{2!\,x^2} - \dots\right\}$$
$$\qquad + \frac{(\gamma-1)!}{(\alpha-1)!}\,e^{x}x^{\alpha-\gamma}\left\{1 + \frac{(1-\alpha)(\gamma-\alpha)}{1!\,x} + \frac{(1-\alpha)(2-\alpha)(\gamma-\alpha)(\gamma-\alpha+1)}{2!\,x^2} + \dots\right\},$$
where in our case $\alpha = 1$, $\gamma = a+b+1$, $x = -t$. This relation, however, breaks down in our case when $(a+b)$ is a half-integer owing to the second series becoming imaginary. Certain tables of the confluent hypergeometric series have been computed by Airey*, but they are of no use to us in the present investigation, being only calculated for values of our $(a+b+1)$ equal to $\frac12$, 1, $1\frac12$, 2, 3 and 4, and for positive values of $x$ (i.e. negative values of $t$).


Second Moment Coefficient of $E^2$.

Returning now to the form (4) of the distribution, we shall multiply by $x^2$ and integrate from 0 to 1. We find
$$\mu_2'(E^2) = \int_0^1 \frac{e^{-t}}{(b-1)!}\, x^2(1-x)^{b-1}\, d\{f^{(b-1)}(x)\}$$
$$= -\int_0^1 \frac{e^{-t}}{(b-1)!}\left\{2x(1-x)^{b-1} - (b-1)\,x^2(1-x)^{b-2}\right\} f^{(b-1)}(x)\,dx$$
$$= \int_0^1 \frac{e^{-t}}{(b-1)!}\left\{2(1-x)^{b-1} - 4(b-1)\,x(1-x)^{b-2} + (b-1)(b-2)\,x^2(1-x)^{b-3}\right\} f^{(b-2)}(x)\,dx$$
on integrating by parts. The part taken between 0 and 1 vanishes at these limits in both cases. Continuing the process, we finally obtain
$$\mu_2'(E^2) = \int_0^1 \frac{e^{-t}}{(b-1)!}\left\{\tfrac12(b-1)(b-2)(b-1)!\,(1-x)^2 - 2(b-1)(b-1)!\,x(1-x) + (b-1)!\,x^2\right\} f'(x)\,dx$$
$$= e^{-t}\int_0^1\left\{\tfrac12(b-1)(b-2) - b(b-1)\,x + \tfrac12 b(b+1)\,x^2\right\} d\left(x^{a+b-1}e^{tx}\right)$$
$$= \tfrac12(b-1)(b-2) - b(b-1) + b(b-1)\,e^{-t}\int_0^1 x^{a+b-1}e^{tx}dx + \tfrac12 b(b+1) - b(b+1)\,e^{-t}\int_0^1 x^{a+b}e^{tx}dx$$
$$= 1 + b(b-1)\,e^{-t}\int_0^1 x^{a+b-1}e^{tx}dx - b(b+1)\,e^{-t}\int_0^1 x^{a+b}e^{tx}dx \qquad (12).$$


* J. R. Airey: British Association, Reports for 1926 and 1927. 
If in (12) we put $t = 0$, it becomes
$$\mu_2'(E^2) = 1 + \frac{b(b-1)}{a+b} - \frac{b(b+1)}{a+b+1} = \frac{a(a+1)}{(a+b)(a+b+1)} = \frac{n_1(n_1+2)}{(n_1+n_2)(n_1+n_2+2)} = \frac{p^2-1}{N^2-1},$$
agreeing with the direct evaluation of the uncorrected second moment coefficient from the special form (2) of the distribution when $\eta^2 = 0$. On the other hand, when $t$ is infinite the same considerations as were taken into account in the determination of the mean value of $E^2$ show that the second and third members of (12) vanish, and we have $\mu_2'(E^2) = 1$ when $\eta^2 = 1$.

Evaluating the integrals in (12) by expansion of $e^{tx}$ and integrating term by term, we obtain
$$\mu_2'(E^2) = 1 + \frac{b(b-1)}{a+b}\,F(1,\ a+b+1,\ -t) - \frac{b(b+1)}{a+b+1}\,F(1,\ a+b+2,\ -t) \qquad (13).$$
Now
$$\sigma^2_{E^2} = \mu_2'(E^2) - (\overline{E^2})^2$$
$$= b(b+1)\left[\frac{1}{a+b}\,F(1,\ a+b+1,\ -t) - \frac{1}{a+b+1}\,F(1,\ a+b+2,\ -t)\right] - \frac{b^2}{(a+b)^2}\left\{F(1,\ a+b+1,\ -t)\right\}^2$$
$$= \frac{b(b+1)}{(a+b)(a+b+1)}\,F(2,\ a+b+2,\ -t) - (1 - \overline{E^2})^2 \qquad (14).$$
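The equivalence of (13) and (14) can be checked numerically. The following sketch assumes moderate $t$ (so that the alternating series can be summed directly) and uses hypothetical function names; it computes the variance both as $\mu_2' - (\overline{E^2})^2$ from (13) and from the compact form (14).

```python
def kummer_series(alpha, gamma, z, terms=400):
    """Term-by-term sum of the confluent hypergeometric series F(alpha, gamma, z)."""
    total, term = 0.0, 1.0
    for r in range(terms):
        total += term
        term *= (alpha + r) / (gamma + r) * z / (r + 1)
    return total

def moments_E2(a, b, t):
    """Return (mean, variance via (13), variance via (14)) of E^2."""
    c = a + b
    f1 = kummer_series(1.0, c + 1.0, -t)
    mean = 1.0 - b / c * f1                                   # equation (8)
    mu2 = (1.0 + b * (b - 1.0) / c * f1
           - b * (b + 1.0) / (c + 1.0) * kummer_series(1.0, c + 2.0, -t))  # (13)
    var_13 = mu2 - mean ** 2
    var_14 = (b * (b + 1.0) / (c * (c + 1.0)) * kummer_series(2.0, c + 2.0, -t)
              - (1.0 - mean) ** 2)                            # equation (14)
    return mean, var_13, var_14

if __name__ == "__main__":
    print(moments_E2(2.0, 3.0, 0.0))   # t = 0: the uncorrelated case
    print(moments_E2(2.0, 2.5, 1.5))   # b a half-integer
```

At $t = 0$ both forms reduce to $ab/\{(a+b)^2(a+b+1)\}$, the variance of the distribution in the uncorrelated case.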

This may also, if desired, be expressed as
$$\sigma^2_{E^2} = -\frac{b(b+1)}{a+b}\,\frac{dF}{dt} - (1 - \overline{E^2})^2,$$
where $F$ represents the series $F(1,\ a+b+1,\ -t)$, i.e. the same series that occurs in the expression (8) for the mean value of $E^2$. This series will have been calculated in any case to obtain $\overline{E^2}$, either directly or by means of the forms (9) or (10), and if a table were to be prepared of the function $F$ it should be accompanied, for completeness, by a table of its derivative, which could either be calculated directly from the confluent hypergeometric series in (14) or in terms of the differences of the table of $F$. Direct calculation by (14) involves computation from the series
$$1 - \frac{2t}{a+b+2} + \frac{3t^2}{(a+b+2)(a+b+3)} - \dots$$
and will not be feasible if $t$ is large compared with $(a+b)$. If this is the case, the simplest way to get the appropriate form for $\sigma^2_{E^2}$ is to differentiate the series in (9) or (10) with respect to $t$. We have
$$F = \frac{a+b}{t}\left[1 + \sum_{r=1}^{a+b-2}\frac{(a+b-1)\dots(a+b-r)}{(-t)^r} + \frac{(a+b-1)!}{(-t)^{a+b-1}}\left(1 - e^{-t}\right)\right] \quad (a+b) \text{ an integer},$$
$$F = \frac{a+b}{t}\left[1 + \sum_{r=1}^{a+b-\frac52}\frac{(a+b-1)\dots(a+b-r)}{(-t)^r} + \frac{(a+b-1)\dots\frac32}{(-t)^{a+b-\frac32}}\left(1 - e^{-t}\int_0^1 e^{tu^2}du\right)\right] \quad (a+b) \text{ a half-integer}.$$
Whence
$$\frac{dF}{dt} = -\frac{a+b}{t^2}\left[1 + \sum_{r=1}^{a+b-2}\frac{(r+1)(a+b-1)\dots(a+b-r)}{(-t)^r} + \frac{(a+b-1)!}{(-t)^{a+b-1}}\left\{(a+b) - (a+b+t)\,e^{-t}\right\}\right] \quad (a+b) \text{ an integer} \qquad (15),$$
$$\frac{dF}{dt} = -\frac{a+b}{t^2}\left[1 + \sum_{r=1}^{a+b-\frac52}\frac{(r+1)(a+b-1)\dots(a+b-r)}{(-t)^r} + \frac{(a+b-1)\dots\frac32}{(-t)^{a+b-\frac32}}\left\{(a+b) - (a+b+t)\,e^{-t}\int_0^1 e^{tu^2}du\right\}\right] \quad (a+b) \text{ a half-integer} \qquad (16).$$


Case when $b$ is a half-integer.

So far the results for the probability integral, and the mean value and variance of $E^2$, from the distribution (3) have only been proved for the case when $b$ is an integer. The series $F(-b,\ a,\ -tx)$ was then a terminating one, and it was possible to express the distribution in terms of the $b$th differential coefficient, with regard to $x$, of the function $f(x) = x^{a+b-1}e^{tx}$. When $b$ is a half-integer, the other possibility, it will be found convenient to use the theory of non-integral differentiation*. The appropriate theory may be briefly outlined as follows: Beginning with the function $f(x)$, let us consider the operation of integrating it repeatedly between the limits 0 and $x$. The first integral is $\int_0^x f(\xi)\,d\xi$. We have then
$$\left(\int_0^x\right)^2 f(\xi)\,d\xi = \int_0^x dy\int_0^y f(\xi)\,d\xi = \int_0^x f(\xi)\,d\xi\int_\xi^x dy = \int_0^x (x-\xi)\,f(\xi)\,d\xi.$$
Repeating the operation, we have
$$\left(\int_0^x\right)^3 f(\xi)\,d\xi = \int_0^x dy\int_0^y (y-\xi)\,f(\xi)\,d\xi = \int_0^x f(\xi)\,d\xi\int_\xi^x (y-\xi)\,dy = \int_0^x \frac{(x-\xi)^2}{2!}\,f(\xi)\,d\xi.$$
In general
$$\left(\int_0^x\right)^n f(\xi)\,d\xi = \int_0^x \frac{(x-\xi)^{n-1}}{(n-1)!}\,f(\xi)\,d\xi.$$
Putting $n = \frac12$, we have
$$\left(\int_0^x\right)^{\frac12} f(\xi)\,d\xi = \int_0^x \frac{f(\xi)\,d\xi}{\sqrt{\pi}\,\sqrt{x-\xi}}.$$
This expression may be regarded as the definition of the $\frac12$th order integral of the function $f(x)$.
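The half-order integral just defined can be illustrated numerically. The sketch below is an assumption-laden illustration, not the author's method: it removes the end-point singularity by the substitution $\xi = x - s^2$ (so that the integral becomes $\frac{2}{\sqrt{\pi}}\int_0^{\sqrt{x}} f(x - s^2)\,ds$) and applies Simpson's rule. It then checks two known cases, and that applying the operation twice to $f = 1$ reproduces the ordinary integral $x$.

```python
import math

def half_integral(f, x, n=2000):
    """Half-order integral (1/sqrt(pi)) * int_0^x f(xi) / sqrt(x - xi) d(xi).

    Uses xi = x - s^2, which turns the integrand into 2 f(x - s^2) on
    0 <= s <= sqrt(x); Simpson's rule with n (even) intervals.
    """
    upper = math.sqrt(x)
    h = upper / n
    g = lambda s: f(x - s * s)
    total = g(0.0) + g(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(i * h)
    return 2.0 / math.sqrt(math.pi) * total * h / 3.0

def half_of_one(y):
    """Closed form of the half-order integral of f = 1, guarded against round-off."""
    return 2.0 * math.sqrt(max(y, 0.0) / math.pi)

if __name__ == "__main__":
    x = 2.0
    print(half_integral(lambda y: 1.0, x), half_of_one(x))
    print(half_integral(half_of_one, x), x)   # two half-integrations of 1 give x
```

The second printed pair agrees only to the accuracy of the quadrature, since the integrand there has a square-root end-point behaviour.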


* I am indebted to Professor Norbert Wiener for this suggestion and for references to the literature of the subject.
To apply this result let $f(x) = x^{a+b-1}e^{tx}$, as before. Then we may write
$$\frac{d^b}{dx^b}\,f(x) = \frac{d^{\,b+\frac12}}{dx^{\,b+\frac12}}\left[\int_0^x \frac{\xi^{a+b-1}e^{t\xi}\,d\xi}{\sqrt{\pi}\,\sqrt{x-\xi}}\right] = h^{(b+\frac12)}(x) \qquad (17),$$
where $h(x)$ denotes the function within square brackets.

Returning now to (4), the distribution of $x$ may be written, formally at least, as
$$df = \frac{e^{-t}}{(b-1)!}\,(1-x)^{b-1}\,h^{(b+\frac12)}(x)\,dx \qquad (18)$$
and $b + \frac12$ is now an integer.


Nature of the function $h(x)$ and its derivatives.

Putting $v = \xi/x$ in the integral of (17), we have
$$\frac{1}{\sqrt{\pi}}\int_0^x \frac{\xi^{a+b-1}e^{t\xi}}{\sqrt{x-\xi}}\,d\xi = \frac{x^{a+b-\frac12}}{\sqrt{\pi}}\int_0^1 v^{a+b-1}(1-v)^{-\frac12}e^{txv}\,dv,$$
i.e.
$$h(x) = \frac{(a+b-1)!}{(a+b-\frac12)!}\,x^{a+b-\frac12}\,F(a+b,\ a+b+\tfrac12,\ tx)$$
$$= \frac{(a+b-1)!}{(a+b-\frac12)!}\,x^{a+b-\frac12}\,e^{tx}\,F(\tfrac12,\ a+b+\tfrac12,\ -tx) \qquad (19),$$
by Kummer's formula, $F$ denoting as before the confluent hypergeometric series. In the first of the two forms given (19) may be differentiated repeatedly without much difficulty, and we have the results
$$h'(x) = \frac{(a+b-1)!}{(a+b-\frac32)!}\,x^{a+b-\frac32}\,F(a+b,\ a+b-\tfrac12,\ tx) \qquad (20),$$
and in general
$$h^{(r)}(x) = \frac{(a+b-1)!}{(a+b-\frac12-r)!}\,x^{a+b-\frac12-r}\,F(a+b,\ a+b+\tfrac12-r,\ tx)$$
$$= \frac{(a+b-1)!}{(a+b-\frac12-r)!}\,x^{a+b-\frac12-r}\,e^{tx}\,F(\tfrac12-r,\ a+b+\tfrac12-r,\ -tx) \qquad (21).$$
Putting $r = b + \frac12$ and substituting in (18), we get immediately the form (3) of the distribution. An alternative form for $h(x)$, useful for computation purposes when $t$ is large, may be obtained by putting $x - \xi = u^2/2t$ in the integral leading to (19). We then have
$$h(x) = \sqrt{\frac{2}{\pi t}}\;x^{a+b-1}e^{tx}\int_0^{\sqrt{2tx}}\left(1 - \frac{u^2}{2tx}\right)^{a+b-1}e^{-\frac12u^2}\,du.$$
On expanding $(1 - u^2/2tx)^{a+b-1}$ by the binomial theorem, and integrating term by term, we have the result
$$h(x) = \frac{2}{\sqrt{t}}\,x^{a+b-1}e^{tx}\left[m_0(\sqrt{2tx}) - \frac{a+b-1}{2tx}\,m_2(\sqrt{2tx}) + \frac{1\cdot3\,(a+b-1)(a+b-2)}{2!\,(2tx)^2}\,m_4(\sqrt{2tx}) - \dots\right] \qquad (22),$$
where $m_{2r}(\sqrt{2tx})$ denotes the $(2r)$th Incomplete Normal Moment Function
$$m_{2r}(\sqrt{2tx}) = \frac{1}{\sqrt{2\pi}}\int_0^{\sqrt{2tx}} u^{2r}e^{-\frac12u^2}\,du \Big/ (2r-1)(2r-3)\dots1,$$
$$m_0(\sqrt{2tx}) = \frac{1}{\sqrt{2\pi}}\int_0^{\sqrt{2tx}} e^{-\frac12u^2}\,du.$$
$m_0$ may therefore be obtained from Sheppard's tables*, and the even $m$'s required are tabulated up to $m_{10}$*. The upper limit of $m_{2r}(\sqrt{2tx})$, when $t$ becomes infinite, is in all cases 0.5.

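Probability Integral.

Returning now to (18) the probability integral is
$$\int_0^x df = \frac{e^{-t}}{(b-1)!}\int_0^x (1-x)^{b-1}\,h^{(b+\frac12)}(x)\,dx.$$
This may be integrated successively by parts, putting $h^{(b+\frac12)}(x)\,dx = d\{h^{(b-\frac12)}(x)\}$, etc., and we find
$$\int_0^x df = e^{-t}\sum_{r=1}^{b-\frac12}\frac{(1-x)^{r-\frac12}}{(r-\frac12)!}\,h^{(r)}(x) + \frac{e^{-t}}{\sqrt{\pi}}\int_0^x\frac{h'(x)\,dx}{\sqrt{1-x}}$$
$$= e^{-t}(a+b-1)!\sum_{r=1}^{b-\frac12}\frac{x^{a+b-\frac12-r}(1-x)^{r-\frac12}}{(a+b-\frac12-r)!\,(r-\frac12)!}\,F(a+b,\ a+b+\tfrac12-r,\ tx) + \frac{e^{-t}}{\sqrt{\pi}}\int_0^x\frac{h'(x)\,dx}{\sqrt{1-x}} \qquad (23),$$
on substituting for $h^{(r)}(x)$ from (21). This is the analogous form to our result (5), and may be completed by evaluating the integral in (23).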

$h'(x)$ may be written, from (20),
$$h'(x) = \sum_{r=0}^{\infty}\frac{(a+b+r-1)!}{(a+b+r-\frac32)!\,r!}\,t^r x^{a+b+r-\frac32},$$
so that
$$\frac{e^{-t}}{\sqrt{\pi}}\int_0^x\frac{h'(x)\,dx}{\sqrt{1-x}} = \frac{e^{-t}}{\sqrt{\pi}}\sum_{r=0}^{\infty}\frac{(a+b+r-1)!\,t^r}{(a+b+r-\frac32)!\,r!}\int_0^x x^{a+b+r-\frac32}(1-x)^{-\frac12}\,dx$$
$$= e^{-t}\sum_{r=0}^{\infty}\frac{t^r}{r!}\,I_x(a+b+r-\tfrac12,\ \tfrac12),$$
where $I_x(p, q)$ is the Incomplete Beta function
$$\frac{(p+q-1)!}{(p-1)!\,(q-1)!}\int_0^x x^{p-1}(1-x)^{q-1}\,dx.$$
The required integral can be expressed, therefore, in a simple series of incomplete Beta functions, which can be obtained for small values of $(a+b)$ from the tables prepared in the Galton Laboratory, and now at press†. For the special case of

* Tables for Statisticians, Part I, pp. 2 and 23; Part II, p. 147.
+ See also Biometrika, Vol. xx1. pp. o74 ~283, and Tables for Statisticians, Part II, 1931, Table XXV. 
$p$ or $q$ equal to $\frac12$, the incomplete Beta function can be expressed in terms of the symmetrical integral as follows:
$$I_x(p,\ \tfrac12) = 1 - 2I_\theta(2p-1),$$
where
$$I_\theta(2p-1) = \int_0^\theta \cos^{2p-1}\theta\,d\theta \Big/ \left(2\int_0^{\pi/2}\cos^{2p-1}\theta\,d\theta\right),$$
and $\theta = \cos^{-1}\sqrt{x}$. The best method of evaluating $I_\theta(2p-1)$, in the absence of tables, is one I gave some years ago* in terms of a series of powers of $1/(2p-1)$.

When $x = 1$, $I_x(p, q) = 1$, and therefore
$$\frac{e^{-t}}{\sqrt{\pi}}\int_0^1\frac{h'(x)\,dx}{\sqrt{1-x}} = e^{-t}\sum_{r=0}^{\infty}\frac{t^r}{r!} = 1.$$


Mean value of $E^2$.

Taking the form (18) of the distribution, let us multiply by $x$ and integrate from 0 to 1. The procedure is exactly similar to that in the earlier part of the paper, where $b$ was considered an integer, and the part taken between the limits 0 and 1 on integrating by parts vanishes in every case. Finally we are left with
$$\overline{E^2} = \frac{e^{-t}}{(b-1)!}\int_0^1\left\{-(b-\tfrac12)(b-1)(b-2)\dots\tfrac32\,(1-x)^{\frac12} + (b-1)(b-2)\dots\tfrac12\,x\,(1-x)^{-\frac12}\right\}h'(x)\,dx$$
$$= e^{-t}\int_0^1\left\{-\frac{(2b-1)(1-x)^{\frac12}}{\sqrt{\pi}} + \frac{x}{\sqrt{\pi}\,(1-x)^{\frac12}}\right\}h'(x)\,dx$$
$$= \frac{e^{-t}}{\sqrt{\pi}}\int_0^1\frac{(1 - 2b + 2bx)\,h'(x)\,dx}{\sqrt{1-x}}$$
$$= \frac{e^{-t}}{\sqrt{\pi}}\int_0^1\frac{h'(x)\,dx}{\sqrt{1-x}} - \frac{2b\,e^{-t}}{\sqrt{\pi}}\int_0^1\sqrt{1-x}\;h'(x)\,dx$$
$$= 1 - \frac{2b\,e^{-t}}{\sqrt{\pi}}\int_0^1\sqrt{1-x}\;h'(x)\,dx$$
$$= 1 - \frac{2b\,e^{-t}}{\sqrt{\pi}}\sum_{r=0}^{\infty}\frac{(a+b+r-1)!\,t^r}{(a+b+r-\frac32)!\,r!}\int_0^1 x^{a+b+r-\frac32}(1-x)^{\frac12}\,dx, \text{ from (20)}$$
$$= 1 - b\,e^{-t}\sum_{r=0}^{\infty}\frac{t^r}{r!\,(a+b+r)}$$
$$= 1 - \frac{b\,e^{-t}}{a+b}\,F(a+b,\ a+b+1,\ t).$$
Finally
$$\overline{E^2} = 1 - \frac{b}{a+b}\,F(1,\ a+b+1,\ -t).$$
Thus our result (8) for the mean value of $E^2$ is shown to hold in general for all values, integral and half-integral, of $b$. The alternative forms (9) and (10), for large $t$, follow as a matter of course.


* Biometrika, Vol. XVII, 1925, pp. 469-472. See Tables for Statisticians, Part II, 1931, pp. ccxxi and 239. Note that this formula is applicable generally for any value of $N$ (our $2p-1$) and should be regarded as superseding the former method and Table XLV.
Second Moment Coefficient of $E^2$.

We repeat the former procedure of multiplying by $x^2$ and integrating successively by parts, stopping when we reach an integral involving $h'(x)$. The parts between the limits vanish, and we have
$$\mu_2'(E^2) = \frac{e^{-t}}{(b-1)!}\int_0^1\Big\{(b-\tfrac12)(b-\tfrac32)(b-1)(b-2)\dots\tfrac52\,(1-x)^{\frac32} - 2(b-\tfrac12)(b-1)(b-2)\dots\tfrac32\,x\,(1-x)^{\frac12}$$
$$\qquad\qquad + (b-1)(b-2)\dots\tfrac12\,x^2(1-x)^{-\frac12}\Big\}\,h'(x)\,dx$$
$$= \frac{4e^{-t}}{3\sqrt{\pi}}\int_0^1\frac{\{(b-\frac12)(b-\frac32) - 2b(b-\frac12)\,x + b(b+1)\,x^2\}\,h'(x)\,dx}{\sqrt{1-x}}$$
$$= \frac{e^{-t}}{\sqrt{\pi}}\left[\int_0^1\frac{h'(x)\,dx}{\sqrt{1-x}} - 4b\int_0^1\sqrt{1-x}\;h'(x)\,dx + \tfrac43 b(b+1)\int_0^1(1-x)^{\frac32}\,h'(x)\,dx\right].$$
The first and second of these integrals have already been evaluated, while the third may be obtained directly from (20) by a suitable modification of the procedure already outlined. We then have
$$\mu_2'(E^2) = 1 - 2b\,e^{-t}\sum_{r=0}^{\infty}\frac{t^r}{r!\,(a+b+r)} + b(b+1)\,e^{-t}\sum_{r=0}^{\infty}\frac{t^r}{r!\,(a+b+r)(a+b+r+1)}$$
$$= 1 - \frac{2b}{a+b}\,e^{-t}F(a+b,\ a+b+1,\ t) + \frac{b(b+1)}{(a+b)(a+b+1)}\,e^{-t}F(a+b,\ a+b+2,\ t)$$
$$= 1 - \frac{2b}{a+b}\,F(1,\ a+b+1,\ -t) + \frac{b(b+1)}{(a+b)(a+b+1)}\,F(2,\ a+b+2,\ -t).$$
This, though not of precisely the same form as (13), is seen to be equivalent to it when we derive $\sigma^2_{E^2}$, for subtracting the square of the mean
$$1 - \frac{2b}{a+b}\,F(1,\ a+b+1,\ -t) + \frac{b^2}{(a+b)^2}\,F^2(1,\ a+b+1,\ -t)$$
we see that
$$\sigma^2_{E^2} = \frac{b(b+1)}{(a+b)(a+b+1)}\,F(2,\ a+b+2,\ -t) - (1 - \overline{E^2})^2,$$
which agrees exactly with (14). This result, therefore, holds generally for the variance of $E^2$ whether $b$ is an integer or a half-integer.

Limiting values of Mean and Second Moment of $E^2$.

In conclusion it will be of interest to show the identity of the limiting value of the expressions (8) and (14) with the corresponding moments of Fisher's distribution (C)*. To establish this we require to find the limits to which the moments of $bE^2$ tend as $b$ is made indefinitely large.

Now
$$\overline{E^2} = 1 - \left(1 + \frac{a}{b}\right)^{-1}\left\{1 - \frac{t}{b}\left(1 + \frac{a+1}{b}\right)^{-1} + \frac{t^2}{b^2} + O\left(\frac{1}{b^3}\right)\right\} \text{ from (8)}$$
$$= \frac{a+t}{b} - \frac{a^2 + (2a+1)t + t^2}{b^2} + O\left(\frac{1}{b^3}\right) \qquad (24).$$

* Proc. Roy. Soc. A, 121, 1928, p. 663. 
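Hence
$$\underset{b\to\infty}{\mathrm{Lt}}\; b\,\overline{E^2} = a + t \qquad (25).$$
Also
$$\sigma^2_{E^2} = \left(1 + \frac1b\right)\left(1 + \frac{a}{b}\right)^{-1}\left(1 + \frac{a+1}{b}\right)^{-1}\left\{1 - \frac{2t}{b}\left(1 + \frac{a+2}{b}\right)^{-1} + \frac{3t^2}{b^2} + O\left(\frac{1}{b^3}\right)\right\} - (1 - \overline{E^2})^2 \text{ from (14)}$$
$$= 1 - \frac{2(a+t)}{b} + \frac{3a^2 + a + (6a+4)t + 3t^2}{b^2} + O\left(\frac{1}{b^3}\right) - (1 - \overline{E^2})^2.$$
But
$$(1 - \overline{E^2})^2 = 1 - \frac{2(a+t)}{b} + \frac{3a^2 + (6a+2)t + 3t^2}{b^2} + O\left(\frac{1}{b^3}\right) \text{ from (24)}.$$
Therefore
$$\sigma^2_{E^2} = \frac{a + 2t}{b^2} + O\left(\frac{1}{b^3}\right),$$
and
$$\underset{b\to\infty}{\mathrm{Lt}}\; b^2\sigma^2_{E^2} = a + 2t \qquad (26).$$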
Now consider the (C) distribution of Fisher. Writing $\frac12 B^2 = x$, $\frac12\beta^2 = t$ and $\frac12 n_1 = a$, it takes the form
$$df = \frac{x^{a-1}e^{-x-t}}{(a-1)!}\left\{1 + \frac{(tx)}{a} + \frac{(tx)^2}{2!\,a(a+1)} + \dots\right\}dx \qquad (27),$$
which we may put in the form
$$df = \left(\frac{x}{t}\right)^{\frac{a-1}{2}} e^{-x-t}\,I_{a-1}(2\sqrt{xt})\,dx,$$
where $I$ denotes the Bessel Function of imaginary argument.

The range of $x$ in (27) is from 0 to $\infty$, and the integral over this range is unity. The moment generating function $M$ is therefore defined by
$$M = \int_0^\infty e^{hx}\,df,$$
since the coefficient of $h^r/r!$ in the expansion of this expression is $\mu_r'(x)$, the $r$th moment coefficient about the origin.

Thus
$$M = \int_0^\infty \frac{x^{a-1}}{(a-1)!}\,e^{-t}e^{-x(1-h)}\left\{1 + \frac{(tx)}{a} + \frac{(tx)^2}{2!\,a(a+1)} + \dots\right\}dx,$$
or
$$M = \int_0^\infty \left(\frac{x}{t}\right)^{\frac{a-1}{2}} e^{-t}e^{-x(1-h)}\,I_{a-1}(2\sqrt{xt})\,dx.$$
Let us now change the variable, writing $x(1-h) = x'$ and at the same time writing $t'$ for $t/(1-h)$ in order to keep the series part in its present form. We then have
$$M = \frac{e^{th/(1-h)}}{(1-h)^a}\int_0^\infty \frac{x'^{\,a-1}}{(a-1)!}\,e^{-x'-t'}\left\{1 + \frac{(t'x')}{a} + \frac{(t'x')^2}{2!\,a(a+1)} + \dots\right\}dx'$$
$$= \frac{e^{th/(1-h)}}{(1-h)^a}\int_0^\infty\left(\frac{x'}{t'}\right)^{\frac{a-1}{2}} e^{-x'-t'}\,I_{a-1}(2\sqrt{x't'})\,dx'$$
$$= \frac{e^{th/(1-h)}}{(1-h)^a} \qquad (28),$$
since the integral is now of the same form as (27).
Further, if we write
$$K = \log M = \kappa_1 h + \kappa_2\frac{h^2}{2!} + \kappa_3\frac{h^3}{3!} + \dots$$
on expanding in powers of $h$, then $\kappa_r$ is the $r$th semi-invariant of the distribution of $x$. $\kappa_1$ is equal to $\mu_1'$, while
$$\kappa_2 = \mu_2, \quad \kappa_3 = \mu_3,$$
$$\kappa_4 = \mu_4 - 3\mu_2^2 = \mu_2^2(\beta_2 - 3),$$
and so on.

We have therefore only to take the logarithm of (28) and find the term in $h^r/r!$ in its expansion.

Now
$$K = \frac{th}{1-h} - a\log(1-h),$$
and the term in $h^r/r!$ in this is simply
$$\kappa_r(x) = (a + rt)\,(r-1)! \qquad (29).$$
This is the general $r$th semi-invariant of the distribution (27), and is a generalisation of the Type III result $a\,(r-1)!$ which holds when $t$ is zero, as is otherwise evident from inspection of (27)*.

In particular the mean value of $x$, i.e. of $\underset{b\to\infty}{\mathrm{Lt}}\,(bE^2)$, is obtained from $\kappa_r(x)$ by putting $r = 1$, and is $a + t$ (see (25)), while the second moment coefficient, $\sigma^2$ (identical with $\kappa_2(x)$), is equal to $a + 2t$ (see (26)). This establishes the relations sought.
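The semi-invariant law (29) can be verified numerically for the first three orders. The sketch below rests on one observation, stated here as an assumption of the illustration: the $k$th term of the series in (27) is a Poisson weight $e^{-t}t^k/k!$ multiplied by a Type III density with index $a+k$, so that the $n$th moment about the origin is $e^{-t}\sum_k (t^k/k!)\,(a+k)(a+k+1)\dots(a+k+n-1)$. The function names and the truncation of the series are hypothetical choices.

```python
import math

def raw_moment(n, a, t, terms=200):
    """n-th moment about the origin of distribution (27), summed over the series."""
    total = 0.0
    weight = math.exp(-t)          # Poisson weight e^{-t} t^k / k!, starting at k = 0
    for k in range(terms):
        rising = 1.0
        for j in range(n):
            rising *= (a + k + j)  # (a+k)(a+k+1)...(a+k+n-1)
        total += weight * rising
        weight *= t / (k + 1)
    return total

def first_three_semi_invariants(a, t):
    """kappa_1, kappa_2, kappa_3 from the raw moments."""
    m1, m2, m3 = (raw_moment(n, a, t) for n in (1, 2, 3))
    k1 = m1
    k2 = m2 - m1 ** 2
    k3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    return k1, k2, k3

if __name__ == "__main__":
    a, t = 2.5, 1.75
    print(first_three_semi_invariants(a, t))
    # formula (29): (a + r t)(r - 1)!
    print(a + t, a + 2 * t, 2 * (a + 3 * t))
```

The two printed triples should agree to rounding error, the first being obtained from the moments and the second from (29).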


We have noted already that Fisher's (A) distribution tends, like (C), to the common limit (B) as $n_2$ increases without limit. We are not concerned here with the properties of the (A) distribution, but it is perhaps relevant to show how the results of the previous paper† on the mean and variance of the (A) distribution also tend, in the limit, to the values (25) and (26) just deduced. In terms of the correlation ratio we have
$$\overline{E^2} = \frac{a}{a+b} + \frac{b\,\eta^2}{a+b+1}\,F(1,\ 1,\ a+b+2,\ \eta^2)$$
from equation (14) of the former paper.

Now write $b\eta^2 = \frac12\beta^2 = t$ in our notation, and we have
$$\overline{E^2} = \frac{a}{b}\left(1 + \frac{a}{b}\right)^{-1} + \frac{t}{b}\left(1 + \frac{a+1}{b}\right)^{-1}\left\{1 + \frac{t}{b^2} + O\left(\frac{1}{b^3}\right)\right\}$$
$$= \frac{a+t}{b} - \frac{a^2 + (a+1)t}{b^2} + O\left(\frac{1}{b^3}\right) \qquad (30).$$
Hence
$$\underset{b\to\infty}{\mathrm{Lt}}\; b\,\overline{E^2} = a + t.$$


* We have here another case of a Bessel Function distribution, the law for the semi-invariants of which is even simpler than that of McKay, given in Biometrika, Vol. XXIV, 1932, pp. 39-44.
† J. Wishart: Biometrika, Vol. XXI, 1931, pp. 353-361.
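Further, from equation (20) of the former paper we have
$$\sigma^2_{E^2} = \frac{b(b+1)(1-\eta^2)^2}{(a+b)(a+b+1)}\,F(2,\ 2,\ a+b+2,\ \eta^2) - (1 - \overline{E^2})^2$$
$$= \left(1 + \frac1b\right)\left(1 - \frac{t}{b}\right)^2\left(1 + \frac{a}{b}\right)^{-1}\left(1 + \frac{a+1}{b}\right)^{-1}\left\{1 + \frac{4t}{b^2} + O\left(\frac{1}{b^3}\right)\right\} - (1 - \overline{E^2})^2$$
$$= 1 - \frac{2(a+t)}{b} + \frac{3a^2 + a + 4(a+1)t + t^2}{b^2} + O\left(\frac{1}{b^3}\right) - (1 - \overline{E^2})^2.$$
$(1 - \overline{E^2})^2$ is obtained from (30), and we have
$$\sigma^2_{E^2} = \frac{a + 2t}{b^2} + O\left(\frac{1}{b^3}\right),$$
and
$$\underset{b\to\infty}{\mathrm{Lt}}\; b^2\sigma^2_{E^2} = a + 2t.$$
In both cases the results are identical with those already deduced from the (C) distribution.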


Summary. 


Beginning with a statement of the nature of the distributions, for three distinct 
cases, of the square of the multiple correlation ratio in samples from a normal 
population, the paper goes on to consider in detail the third of these, namely that 
appropriate to the case where the array totals are supposed the same for all arrays. 
Expressions are reached for the probability integral of the distribution, and for the 
mean value and variance of the square of the sample correlation ratio. It is shown 


finally that the mean and variance tend, in common with the analogous results for 
the other general distribution previously studied, to the corresponding parameters 
of Fisher’s limiting distribution (B), as the size of the sample is increased without 
limit, and the general semi-invariant of this limiting distribution is given. 



































ON THE PROBABILITY THAT TWO INDEPENDENT 
DISTRIBUTIONS OF FREQUENCY ARE REALLY 
SAMPLES FROM THE SAME PARENT 
POPULATION. 


By KARL PEARSON. 


1. Let there be $v$ categories in either sample and suppose that the parent population has the same $v$ categories, and that the chance of drawing an individual from the $s$th category of the parent population is $p_s$, where $s$ may be 1, 2, 3 ... $v$. Let the category contents of the two samples of sizes $N$ and $N'$ be respectively
$$n_1,\ n_2,\ n_3,\ \dots\ n_s,\ \dots\ n_v$$
and
$$n_1',\ n_2',\ n_3',\ \dots\ n_s',\ \dots\ n_v'.$$
Then I proved in a paper published in 1911*, that if
$$\chi^2 = \frac{NN'}{N+N'}\sum_{s=1}^{v}\left(\frac{n_s}{N} - \frac{n_s'}{N'}\right)^2\Big/\,p_s \qquad \text{(i)},$$
the frequency distribution of $\chi^2$ would be given by
$$y = y_0\,e^{-\frac12\chi^2}\left(\tfrac12\chi^2\right)^{\frac12(v-3)}\left[d\left(\tfrac12\chi^2\right)\right] \qquad \text{(ii)},$$
and $P$, the probability of $\chi^2$ not falling short of a given value, would be found by entering the $(\chi^2, P)$ tables under that value of $\chi^2$ and with $n' = v$.
If the parent population be known, or we are testing whether the two samples are likely to have been drawn from a hypothetical population, the problem is perfectly straightforward, because the series $p_s$ will be given; and the answer may be found by fairly easy arithmetical work. When, however, the parent population is unknown, and the question put to us is: Are the two samples likely to have been drawn from some unspecified parent population? the answer is not so easy to provide.

Unfortunately the answer I gave in 1911 was not the correct one. I wrote:

"Now the best hypothesis as to the constitution of this [the parent] population, on the assumption that both frequencies are random samples of it, will be that its $s$th frequency class is that indicated by the combined two samples†."

In other words, I suggested that $p_s$ should be taken equal to $(n_s + n_s')/(N + N')$. This was not a bad suggestion‡, if the samples were considerable in size, but it is in no way the "best" hypothesis. If we adopt it, then
$$\chi^2 = NN'\sum_{s=1}^{v}\left(\frac{n_s}{N} - \frac{n_s'}{N'}\right)^2\Big/\,(n_s + n_s') \qquad \text{(iii)},$$

* Biometrika, Vol. VIII, pp. 250-254. † loc. cit. p. 252.

‡ This use of the sample values for the unknown parent population values is so usual in the theory of errors that it frequently escapes comment; it is really only legitimate in the cases of large samples.
and the table of the two samples can be arranged as a biserial contingency table, and it has unfortunately come to be spoken of as such. The true form (i) cannot be represented as a contingency table, and (iii) will, as a rule, not give a contingency table for any other pair of samples which help to make up the complement of $\chi^2$'s involved in $P$. It has led to many students forgetting what the $n_s + n_s'$ stands for, i.e. an approximation more or less adequate for the unknown $(N + N')\,p_s$.


If we try to think over what the words "best hypothesis" mean in this matter, ought we not to interpret them as signifying that hypothesis as to the $p_s$'s which will give the highest probability of the two samples being drawn from the same population? Surely, if we are asking whether the two samples are likely to have been drawn from some unknown parent population, we ought to choose for that unknown population the one that makes the probability $P$ of their common sampledom a maximum, or the value of $\chi^2$ as small as possible.

Now it is quite possible to determine this system of $p_s$'s. Of course when found they may contradict some other experience we have had. But here in this problem we are supposed to start with no past experience, i.e. with quite unknown $p_s$'s. Had we some previous experience of the latter we should have to discover not "the most likely" but what I have termed "the most reasonable" values of the $p_s$ series*.


Proceeding to the determination of the most likely values of the $p_s$ series, we have to make
$$\chi^2 = \nu\sum_{s=1}^{v}\left(\frac{n_s}{N} - \frac{n_s'}{N'}\right)^2\Big/\,p_s,$$
where $\nu = NN'/(N + N')$, a minimum, subject to the conditions that:

(a) $\sum_{s=1}^{v} p_s = 1$,
and
(b) the values obtained for the $p_s$'s are possible as probabilities, i.e. they must all be positive and less than unity.

Following the usual method with an indeterminate multiplier $\lambda$:
$$0 = -\nu\sum_{s=1}^{v}\left(\frac{n_s}{N} - \frac{n_s'}{N'}\right)^2\frac{\delta p_s}{p_s^2},$$
$$0 = \sum_{s=1}^{v}\delta p_s,$$
or we obtain the series of equations
$$-\nu\left(\frac{n_s}{N} - \frac{n_s'}{N'}\right)^2\Big/\,p_s^2 + \lambda = 0.$$
Multiply by $p_s$ and sum, we find
$$\chi^2_{min.} - \lambda = 0.$$

* On the determination of the "most likely" and "most reasonable" values of the constants of a parent population see Tables for Statisticians and Biometricians, Part II, pp. clxxi et seq.
Hence
$$p_s^2 = \frac{\nu}{\chi^2_{min.}}\left(\frac{n_s}{N} - \frac{n_s'}{N'}\right)^2.$$
Taking the square root of this we have a plus and minus root, and by the nature of $p_s$ we must take the positive result. In other words
$$p_s = \sqrt{\frac{\nu}{\chi^2_{min.}}}\;\left|\frac{n_s}{N} - \frac{n_s'}{N'}\right|,$$
the difference being given the positive sign.

Sum the result just obtained and we have
$$1 = \sqrt{\frac{\nu}{\chi^2_{min.}}}\;\sum_{s=1}^{v}\left|\frac{n_s}{N} - \frac{n_s'}{N'}\right|.$$
Thus
$$\chi^2_{min.} = \nu\left\{\sum_{s=1}^{v}\left|\frac{n_s}{N} - \frac{n_s'}{N'}\right|\right\}^2 \qquad \text{(iv)},$$
and
$$p_s = \left|\frac{n_s}{N} - \frac{n_s'}{N'}\right|\Big/\sum_{s=1}^{v}\left|\frac{n_s}{N} - \frac{n_s'}{N'}\right| \qquad \text{(v)}.$$
These values satisfy all the requirements of the problem. $p_s$ is always positive and less than unity, and $\sum_{s=1}^{v} p_s = 1$. Further, the $\chi^2$ obtained is a minimum and not a maximum because we can choose the $p_s$ series so that $\chi^2$ can be as large as we please.

Accordingly we have chosen a parent population which gives us the best chance that the two samples are drawn from the same parent population.
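The minimising system (iv) and (v) is easily put into computational form. The sketch below is an illustration only: the function names and the small three-category example are hypothetical and are not data from the paper. It evaluates (i) for any assumed $p_s$ series, computes the "most likely" series from (v), and confirms on the toy data that the pooled-sample series gives a larger $\chi^2$ than the minimum (iv).

```python
def chi2_two_samples(n, n_prime, p):
    """Equation (i): nu * sum (n_s/N - n_s'/N')^2 / p_s for an assumed p series."""
    N, Np = sum(n), sum(n_prime)
    nu = N * Np / (N + Np)
    return nu * sum((x / N - y / Np) ** 2 / ps
                    for x, y, ps in zip(n, n_prime, p))

def most_likely_parent(n, n_prime):
    """Equations (iv) and (v): the p series minimising (i), and the minimum itself.

    Assumes the two samples differ in at least one category.
    """
    N, Np = sum(n), sum(n_prime)
    nu = N * Np / (N + Np)
    diffs = [abs(x / N - y / Np) for x, y in zip(n, n_prime)]
    total = sum(diffs)
    p = [d / total for d in diffs]       # equation (v)
    chi2_min = nu * total ** 2           # equation (iv)
    return p, chi2_min

if __name__ == "__main__":
    # Hypothetical toy samples, chosen so that no category difference is zero.
    n, n_prime = [30, 45, 25], [20, 50, 30]
    p, chi2_min = most_likely_parent(n, n_prime)
    pooled = [(x + y) / (sum(n) + sum(n_prime)) for x, y in zip(n, n_prime)]
    print(p, chi2_min)
    print(chi2_two_samples(n, n_prime, p), chi2_two_samples(n, n_prime, pooled))
```

A category in which the two sample proportions coincide receives $p_s = 0$ under (v), so (i) cannot then be evaluated term by term, although (iv) still gives the limiting minimum.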
We will now illustrate this numerically. 
2. The following data have been extracted by R. A. Fisher* from Tocher's Scottish returns for the children of a certain locality:

TABLE I.
Hair Colour.

          Fair    Red    Medium    Dark    Jet Black    Totals
Boys       592    119       849     504           36      2100
Girls      544     97       677     451           14      1783
Totals    1136    216      1526     955           50      3883

* Statistical Methods for Research Workers, 3rd ed.
Here $N = 2100$, $N' = 1783$ and the proportional frequencies are:

                       Fair         Red        Medium        Dark       Jet Black      Totals
Boys  ($n_s/N$)      .281,9048   .056,6667   .404,2857   .240,0000   .017,1428   1.000,0000
Girls ($n_s'/N'$)    .305,1038   .054,4027   .379,6971   .252,9445   .007,8519   1.000,0000
$n_s/N - n_s'/N'$   -.023,1990   .002,2640   .024,5886  -.012,9445   .009,2909

Further: $NN'/(N + N') = \nu = 964.2802$.


Everything is now ready for substitution in (i), when we have selected our $p_s$ series. Fisher takes for his $p_s$ series the values $(n_s + n_s')/(N + N')$, which are obtained by dividing the third row of figures of Table I by its total 3883. Let these be termed $p_s'$'s. We have
$$p_1' = .292,5573,\quad p_2' = .055,6271,\quad p_3' = .392,9951,\quad p_4' = .245,9439,\quad p_5' = .012,8766 \qquad \text{(vi)}.$$
If now we square the last line of the table above, divide each square by its appropriate $p_s'$ from (vi), add and multiply the result by $\nu$, we find
$$\chi^2 = 10.4674.$$
Fisher gives practically an equivalent $\chi^2 = 10.468$. By our method we have five categories (no question arises of degrees of freedom) and we look out the $(\chi^2, P)$ table under $\chi^2 = 10.4674$ and $n' = 5$ and find $P = .034$. This agrees with Fisher who says the value of $P$ lies between .02 and .05. He concludes "that the sex difference in the classification by hair colours is probably significant as judged by this district alone."

We ask, however, whether another parent population could not be found contradicting this result.

Turning back to the table of proportional frequencies we add without regard to sign its third row of figures and find for its total
$$\sum_{s=1}^{5}\left|\frac{n_s}{N} - \frac{n_s'}{N'}\right| = .072,2870,$$
whence by (v) we have
$$p_1 = .320,9291,\quad p_2 = .031,3196,\quad p_3 = .340,1524,\quad p_4 = .179,0709,\quad p_5 = .128,5280 \qquad \text{(vii)}.$$
Now here are another series of $p$'s, which will not like the $p_s'$'s in (vi) lie between the values found for the two samples. But have we any reason to suppose the parent population must have relative frequencies lying between those of the two samples? All we can say, if we are in complete ignorance of the parent population, and are determining "the most likely" values of its proportional frequencies, is that we ought to use (vii) rather than (vi). We can either use (i) as we have already squared the items of the third row of the table above, or more briefly use (iv); we have
$$\chi^2_{min.} = 964.2802\,(.072,2870)^2 = 5.0388,$$
a value less than half that given when we use the combined samples' value
$$(n_s + n_s')/(N + N')$$
to determine $p_s$. This value of $\chi^2$ leads to $P = .284$, from which we should conclude that it would be quite reasonable to suppose no sexual difference in regard to hair colour of boys and girls in this particular locality.
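The arithmetic of this illustration can be reproduced from Table I. The sketch below is a check under stated assumptions rather than the method by which the figures were originally obtained: it takes $n' = 5$ in the $(\chi^2, P)$ table to correspond to a $\chi^2$ distribution with four degrees of freedom, for which the tail probability has the closed form $P = e^{-\chi^2/2}(1 + \chi^2/2)$, and the variable names are hypothetical.

```python
import math

boys = [592, 119, 849, 504, 36]     # Table I, first row
girls = [544, 97, 677, 451, 14]     # Table I, second row
N, Np = sum(boys), sum(girls)       # 2100 and 1783
nu = N * Np / (N + Np)              # NN'/(N+N')

# Third row of the table of proportional frequencies.
diffs = [b / N - g / Np for b, g in zip(boys, girls)]

# Series (vi): combined-sample proportions; chi-square from (i).
pooled = [(b + g) / (N + Np) for b, g in zip(boys, girls)]
chi2_pooled = nu * sum(d * d / p for d, p in zip(diffs, pooled))

# Series (vii) from (v), and the minimum chi-square from (iv).
abs_total = sum(abs(d) for d in diffs)
p_min = [abs(d) / abs_total for d in diffs]
chi2_min = nu * abs_total ** 2

def tail_probability_four_df(chi2):
    """P(chi-square >= chi2) for four degrees of freedom (assumed to match n' = 5)."""
    return math.exp(-chi2 / 2.0) * (1.0 + chi2 / 2.0)

if __name__ == "__main__":
    print(round(nu, 4), round(abs_total, 7))
    print(round(chi2_pooled, 4), round(tail_probability_four_df(chi2_pooled), 3))
    print(round(chi2_min, 4), round(tail_probability_four_df(chi2_min), 3))
    print([round(p, 7) for p in p_min])
```

The closed form gives probabilities of about .033 and .283, which agree with the tabled values .034 and .284 quoted in the text to within the accuracy of table interpolation.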


This illustration is not given to prove that there is no sexual difference in hair colour, but as a warning against placing too great a trust in the $\chi^2$ method of estimating the difference of two samples by the spurious contingency method, i.e. the method which without thinking of the parent population, and the nature of the assumption $p_s = (n_s + n_s')/(N + N')$ uses (iii) as an invariably safe test. There are clearly numerous parent populations between (vi) and (vii) which would admit of the two samples being reasonably considered to have arisen from the same population.


3. We can extend the conception of the first section of this paper to a more
general problem. Suppose we have a population involving two characteristics A and
B classified into u and v categories respectively, and let the chance of an individual
being drawn from the sth category of A be $p_s$ and from the tth category of B be
$q_t$. Further, let the chance of an individual being drawn combining both
characteristics be $a_{st}$, where $a_{st}$ will not be equal to $p_s \times q_t$ unless the
characteristics are independent. Now we can represent this population in the form of the table


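$$\begin{array}{c|cccccc|c}
 & B_1 & B_2 & \cdots & B_t & \cdots & B_v & \\ \hline
A_1 & a_{11} & a_{12} & \cdots & a_{1t} & \cdots & a_{1v} & p_1 \\
A_2 & a_{21} & a_{22} & \cdots & a_{2t} & \cdots & a_{2v} & p_2 \\
\vdots & & & & & & & \vdots \\
A_s & a_{s1} & a_{s2} & \cdots & a_{st} & \cdots & a_{sv} & p_s \\
\vdots & & & & & & & \vdots \\
A_u & a_{u1} & a_{u2} & \cdots & a_{ut} & \cdots & a_{uv} & p_u \\ \hline
 & q_1 & q_2 & \cdots & q_t & \cdots & q_v & 1
\end{array}$$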


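Now suppose a sample of size N be drawn from the above population as parent
population, and let the distribution in the u × v cells be represented by the scheme
below:

$$\begin{array}{cccccc|c}
n_{11} & n_{12} & \cdots & n_{1t} & \cdots & n_{1v} & n_{1.} \\
n_{21} & n_{22} & \cdots & n_{2t} & \cdots & n_{2v} & n_{2.} \\
\vdots & & & & & & \vdots \\
n_{s1} & n_{s2} & \cdots & n_{st} & \cdots & n_{sv} & n_{s.} \\
\vdots & & & & & & \vdots \\
n_{u1} & n_{u2} & \cdots & n_{ut} & \cdots & n_{uv} & n_{u.} \\ \hline
n_{.1} & n_{.2} & \cdots & n_{.t} & \cdots & n_{.v} & N
\end{array}$$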
Then the mean square contingency of such a sample is defined as

$$\phi^2 = \frac{1}{N}\, S_{s=1}^{s=u} S_{t=1}^{t=v} \frac{(n_{st} - N a_{st})^2}{N a_{st}} = \frac{1}{N}\, S_{s=1}^{s=u} S_{t=1}^{t=v} \left(\frac{n_{st}^2}{N a_{st}}\right) - 1.$$

Similarly, if $\chi^2$ be taken as $N\phi^2$, we have

$$\chi^2 + N = S_{s=1}^{s=u} S_{t=1}^{t=v} \left(\frac{n_{st}^2}{N a_{st}}\right) \qquad \text{(viii)}.$$


Now if there be no correlation between the variates in the parent population, i.e.
$a_{st} = p_s q_t$, then on certain assumptions with regard to the size of $n_{st}$, $\chi^2$ and
therefore $N\phi^2$ as thus defined will be distributed according to the law


paw 8 I isos ccakaticceisavesvennseeis (ix), 


for the $n_{st}$’s being independent, it does not matter whether we arrange them in the
form of the above table, or in a single row or column.


Now, suppose we have no knowledge whatever of the parent population, then
what are the best values to give the unknown $a_{st}$’s?
Clearly the only relation binding on their choice is $S_{s=1}^{s=u} S_{t=1}^{t=v} (a_{st}) = 1$.


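Let us find the minimum value of $\chi^2$ subject to this condition; we have, dropping
the double summation sign for brevity,

$$0 = -S\left(\frac{n_{st}^2}{N a_{st}^2}\,\delta a_{st}\right), \qquad 0 = S(\delta a_{st}).$$

Using an indeterminate multiplier $\lambda$, we have

$$\frac{n_{st}^2}{N a_{st}^2} = \lambda$$

for all values of s and t.

Multiplying by $a_{st}$ and summing,

$$S\left(\frac{n_{st}^2}{N a_{st}}\right) = \lambda\, S(a_{st}) = \lambda = \chi^2_{\min} + N.$$

Further:

$$a_{st} = \frac{n_{st}}{\sqrt{N}\sqrt{\lambda}} = \frac{n_{st}}{\sqrt{N}\sqrt{\chi^2_{\min} + N}},$$

and accordingly:

$$S\left(\frac{n_{st}}{N}\right)\sqrt{N}\sqrt{\chi^2_{\min} + N} = \chi^2_{\min} + N, \qquad S\left(\frac{n_{st}}{\sqrt{N}}\right) = \sqrt{N} = \sqrt{\chi^2_{\min} + N},$$

$$\therefore\ \chi^2_{\min} = 0, \quad \text{and} \quad a_{st} = \frac{n_{st}}{N},$$

from which

$$p_s = S_{t=1}^{t=v}(a_{st}) = \frac{n_{s.}}{N} \quad \text{and} \quad q_t = \frac{n_{.t}}{N}.$$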
Thus the “best” form to give the parent population margins, i.e. that which makes
$\chi^2$ a minimum for this sample, is that of frequencies proportional to those of the sample
itself. In this case





ee ae 
= 
Now in order that (ix) may hold, the $a_{st}$ series must be supposed constant
throughout the sampling, i.e. we are to suppose $n_{s.}$ and $n_{.t}$ remain the same for all further
samples and only $n_{st}$ to vary. Thus the successive samples cannot be arranged as
contingency tables. For if we change $n_{s.}$ and $n_{.t}$ with each sample we are making
a new parent population with each sample, and the samples cannot then be supposed
drawn from the same parent population*.
We now reach a case of which much use has been made, but which I think
needs very careful handling.
As in the previous section we have

$$N(1 + \phi^2) = \chi^2 + N = S\left(\frac{n_{st}^2}{N a_{st}}\right) \qquad \text{(xi)},$$

and we desire to determine whether there is independence in certain results,
which are assumed (as hypothesis) to be sampled from a population of zero
contingency.

We can best illustrate this by an example given by Dr R. A. Fisher†, dealing
with Wachter’s data for back-crosses in mice. He gives a fourfold table running
as follows:

TABLE II.

                         Black Self   Black Piebald   Brown Self   Brown Piebald   Totals
Coupling    F1 Males         88            82             75             60          305
Coupling    F1 Females       38            34             30             21          123
Repulsion   F1 Males        115            93             80            130          418
Repulsion   F1 Females       96            88             95             79          358
Totals                      337           297            280            290         1204








Now neglecting the marginal totals the problem seems to be: Could the 16 cell
frequencies have arisen from sampling a parent population with no contingency?
Thus $a_{st} = p_s \times q_t$. But we have taken of the four categories four samples of the
sizes 305, 123, 418, 358. Are we going to confine our attention to such distributions


* There is nothing to prevent $\phi^2$ with $n_{s.}$ and $n_{.t}$ varying from sample to sample being used as a
statistical coefficient, but in that case $\chi^2$ is not $N\phi^2$, if we mean by $\chi^2$ that which is distributed by the
law of equations (ii) or (ix). Result (x) seems to justify the usual expression for $\phi^2$ as a measure of the
departure from independence ($n_{st} = n_{s.} n_{.t}/N$, or $\chi^2_{\min} = 0$), when we have no knowledge of the parent
population.

† Statistical Methods for Research Workers, 3rd ed. p. 86.
as arise when we repeatedly make samples of these same sizes? Apparently we are
to do so and this, though somewhat artificial, could be carried out, if with difficulty.
We thus reach what I have termed a coefficient of partial contingency* with four
linear equations of condition among the $n_{st}$’s. This will reduce the $n'$ of our ($\chi^2$, P)
table by three, as that table takes account of the total size of the sample 1204. So
far so good. But now we come to the horizontal marginal totals. These must vary
from experiment to experiment; what reason is there for treating 337/1204, 297/1204,
280/1204 and 290/1204 as the values of $q_1$, $q_2$, $q_3$ and $q_4$ in the hypothetical parent
population of no contingency from which we suppose our four samples extracted? Out of
all possible repetitions of the four series of crossings, those giving the horizontal
marginal totals coinciding with those of these actually performed experiments in
the several categories of mice must be of the highest rarity and we should find it
practically impossible to obtain such sets. It would seem that in choosing the
horizontal marginal totals as the values of the $q_t$’s we are really repeating what was
done in the biserial table, i.e. assuming that as we do not know the q’s we shall do
the “best” we can by supposing they agree with those of the observed frequencies.
If we assume that all further experiments are to give these same marginal
frequencies, we again limit by three more linear relations our contingency, or, in
looking up P from $\chi^2$ we must reduce $n'$ by six, or enter the table with $n' = 10$.
The problem we are then answering is this: If we made further quadruple
experiments each having the same number of mice from each form of crossing, and each
quadruple experiment giving precisely the same numbers of Black Self, Black
Piebald, Brown Self, and Brown Piebald mice, in how many cases might (on the
basis of independence) at least as great a value of $\chi^2$ be expected? But this
further limitation is unnecessary, if all we have assumed is a system of likely values
for the $q_t$’s, and suppose successive further samples not to give the same horizontal
marginal totals. Dr Fisher appears to prefer the extreme limitation to 9 degrees
of freedom. This forces his further samples into a partial contingency table form,
with all the marginal frequencies identical with those of the actual sample.


Not only will the $\chi^2$ and therefore the P, as I have shewn†, be dependent on
the number of mice resulting from each cross, for example, be altered if we had 209
instead of 418 “Repulsion, F1 Males,” but in practice the repetition of the coat
colour distribution in further quadruple experiments is unattainable. It seems
awkward in applied science to say something will occur, if so and so be done, when
the doing of the latter is practically impossible.


Some, if not all, the difficulty may be surmounted, if we turn back to our value
of $\chi^2$, namely:

$$\chi^2 + N = S_s S_t \left(\frac{n_{st}^2}{N p_s q_t}\right),$$

since the parent population is assumed by hypothesis to have independence, and
ask, supposing the $p_s$’s to be fixed, what are the “best” values to give to the $q_t$’s on
the basis of the observed results? Our answer is as before that the greatest
probability for the observed results on the basis of an independent parent population
will be obtained by choosing the $q_t$’s so as to make $\chi^2$ a minimum.

* “On the General Theory of Multiple Contingency, with Special Reference to Partial Contingency,”
Biometrika, Vol. XI. pp. 145–158; see in particular p. 146.
† Biometrika, Vol. XXIV. pp. 302–303, footnote.

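We proceed to make

$$\chi^2 = S_s S_t \left(\frac{n_{st}^2}{N p_s q_t}\right) - N \qquad \text{(xii)}$$

a minimum subject to the condition that

$$S_t (q_t) = 1.$$

We have with $\lambda$ as indeterminate multiplier

$$S_s\left(\frac{n_{st}^2}{N p_s}\right)\frac{1}{q_t^2} = \lambda,$$

or multiplying up by $q_t$ and summing for t:

$$\chi^2_{\min} + N = S_t(\lambda q_t) = \lambda.$$

Hence:

$$q_t = \sqrt{S_s\left(\frac{n_{st}^2}{N p_s}\right)} \bigg/ \sqrt{\chi^2_{\min} + N},$$

and it follows on substituting for $q_t$ in (xii) that

$$\chi^2_{\min} + N = \frac{1}{N}\left\{ S_t \sqrt{S_s\left(\frac{n_{st}^2}{p_s}\right)} \right\}^2 \qquad \text{(xiii)}$$

and that

$$q_t = \sqrt{S_s\left(\frac{n_{st}^2}{p_s}\right)} \bigg/ S_t \sqrt{S_s\left(\frac{n_{st}^2}{p_s}\right)} \qquad \text{(xiv)},$$

or $q_t$ is always < 1 and $S_t(q_t) = 1$.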
t 


Applying these results to our example, we proceed first to square all the terms
$n_{st}$ and take the reciprocals of the $p_s$’s. Thus we find, proceeding down a column,
$S_s\left(n_{st}^2/p_s\right)$ and then take the square roots of these expressions. We obtain:

$$\sqrt{S_s\left(\frac{n_{s1}^2}{p_s}\right)} = 337.3310, \qquad \sqrt{S_s\left(\frac{n_{s2}^2}{p_s}\right)} = 298.0192,$$

$$\sqrt{S_s\left(\frac{n_{s3}^2}{p_s}\right)} = 282.4913, \qquad \sqrt{S_s\left(\frac{n_{s4}^2}{p_s}\right)} = 296.9776,$$

and accordingly:

$$S_t \sqrt{S_s\left(\frac{n_{st}^2}{p_s}\right)} = 1214.8191,$$

$$\chi^2_{\min} = \frac{(1214.8191)^2}{1204} - 1204 = 21.735,$$

and:

$q_1 = .277,6800$, $q_2 = .245,3198$, $q_3 = .232,5378$, $q_4 = .244,4624$.
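The arithmetic of (xiii) and (xiv) for Table II can be retraced with a short script. This is only a sketch in modern notation, not part of the original paper: the function name `best_q_and_chi2_min` and the variable names are illustrative assumptions, and the $p_s$ are taken, as in the text, to be the observed row proportions $n_{s.}/N$.

```python
import math

# Table II as quoted in the text: rows are the four mating series,
# columns are the four coat-colour classes.
TABLE_II = [
    [88, 82, 75, 60],    # Coupling, F1 males
    [38, 34, 30, 21],    # Coupling, F1 females
    [115, 93, 80, 130],  # Repulsion, F1 males
    [96, 88, 95, 79],    # Repulsion, F1 females
]


def best_q_and_chi2_min(table):
    """Return (column roots, q_t values, chi^2_min), with p_s fixed at n_s./N.

    Hypothetical helper written for illustration only.
    """
    row_totals = [sum(row) for row in table]
    n_total = sum(row_totals)
    p = [r / n_total for r in row_totals]
    n_rows, n_cols = len(table), len(table[0])
    # sqrt( S_s n_st^2 / p_s ) for each column t
    roots = [
        math.sqrt(sum(table[s][t] ** 2 / p[s] for s in range(n_rows)))
        for t in range(n_cols)
    ]
    root_sum = sum(roots)
    q = [r / root_sum for r in roots]             # equation (xiv)
    chi2_min = root_sum ** 2 / n_total - n_total  # equation (xiii)
    return roots, q, chi2_min


roots, q, chi2_min = best_q_and_chi2_min(TABLE_II)
print([round(r, 4) for r in roots])
print([round(v, 7) for v in q])
print(round(chi2_min, 3))
```

The four column roots, their sum and the resulting $q_t$ can then be compared with the figures quoted above.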
Dr Fisher obtains for $\chi^2$ the value
$$\chi_0^2 = 21.832,$$
working his table as an ordinary contingency table, and his $q_t$’s, which are his
horizontal marginal totals divided by his N (= 1204), will be
$q_1 = .279,9003$, $q_2 = .246,6777$, $q_3 = .232,5582$, $q_4 = .240,8638$.
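For comparison, the ordinary contingency value can be retraced from (xi) with $a_{st} = n_{s.} n_{.t}/N^2$, which gives $\chi^2 = N\{S(n_{st}^2/(n_{s.} n_{.t})) - 1\}$. The sketch below assumes that this is the computation behind the quoted 21.832; the function name `contingency_chi2` is illustrative and does not come from the paper.

```python
# Table II as quoted in the text (rows: mating series, columns: coat colours).
TABLE_II = [
    [88, 82, 75, 60],
    [38, 34, 30, 21],
    [115, 93, 80, 130],
    [96, 88, 95, 79],
]


def contingency_chi2(table):
    """Ordinary contingency chi^2 with both margins taken from the sample."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n_total = sum(row_totals)
    s = sum(
        table[i][j] ** 2 / (row_totals[i] * col_totals[j])
        for i in range(len(table))
        for j in range(len(col_totals))
    )
    return n_total * (s - 1.0)


# Marginal column proportions, i.e. the q_t implied by the contingency method.
col_totals = [sum(col) for col in zip(*TABLE_II)]
marginal_q = [c / sum(col_totals) for c in col_totals]

print(round(contingency_chi2(TABLE_II), 3))
print([round(v, 7) for v in marginal_q])
```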

The reader will say: “It is true your $\chi^2$ is less than Dr Fisher’s, but your $q_t$’s are
so close to his and both your $\chi^2$’s differ only by an insignificant difference, that it
is not clear why an attempt should be made to improve on them*.” But there is
really a wide difference between the two methods of approach! Suppose we knew or
guessed the $p_s$’s and $q_t$’s of the parent population. Then we should use Equation (xi)
to compute $\chi^2$ and we should enter the ($\chi^2$, P) table with $n' = 16$, because there
would not (beside the size of the sample) be any restrictions whatever on the
freedom of our sample. By fixing the $p_s$’s, because there is no “natural” size for
the relative numbers of matings we may make artificially, we have reduced our 
conclusions, whatever they may be, to apply only to a repetition of experiments of 
these sizes. We have destroyed the possibility of a general law; we cannot assert 
that for other sizes of samples, we should deduce the same conclusion. We have 
reached a conclusion for a narrower universe by sacrificing three degrees of freedom. 

But at any rate let us attempt to reach a conclusion for a broader universe by not 
sacrificing further degrees of freedom! If we make the coat-colour distributions to 
be the same for every set of quadruple types of matings, our conclusions will apply 
only to an absolutely restricted and practically irreproducible universe! But how
can we avoid this? Only by assuming some set of $q_t$’s for the hypothetical parent
population. How may we do this? There are two obvious ways: (a) assume that
the experiments give a good approximation to the required $q_t$’s, or (b) find the
most likely $q_t$’s by the method of this paper, i.e. those which make the probability
of the observed result a maximum. In either case we do not further restrict the 
degrees of freedom. Further quadruple samples will not have the horizontal marginal 
totals the same as those of the observed sample—i.e. will not take the form of con- 
tingency tables. What then has Dr Fisher done, when he reduces his degrees of 
freedom by still a further three? He has picked out of all the possible samples 
those which have their distributions the same for the coat colours. His conclusions 
therefore only hold for that extraordinarily limited universe. 

It is of interest to note that in this particular case (it is far from being so in
every case) the observed values of the coat-colour distribution are strikingly like
the “best” values and lead to the same value practically of $\chi^2$*.

If we took out the P that corresponds to that $\chi^2$ with $n' = 16 - 3 = 13$, we find
from (a) P = .040, and from (b) P = .041.
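The entries of the ($\chi^2$, P) table used here can be approximated directly. The sketch below assumes that Pearson's $n'$ corresponds to $n' - 1$ degrees of freedom in the modern $\chi^2$ distribution and uses the standard closed forms of the upper tail for integer degrees of freedom; the function name `chi2_tail` is illustrative. Small differences from the quoted P values in the third decimal are to be expected, since the originals were read from printed tables.

```python
import math


def chi2_tail(chi2, n_prime):
    """Upper-tail probability P for a given chi^2 and Pearson's n' (df = n' - 1)."""
    df = n_prime - 1
    half = chi2 / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{j=1}^{(df-1)/2} (x/2)^(j-1/2) / Gamma(j+1/2)
    total = 0.0
    for j in range(1, (df - 1) // 2 + 1):
        total += half ** (j - 0.5) / math.gamma(j + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


print(round(chi2_tail(21.832, 13), 3))  # method (a), n' = 13
print(round(chi2_tail(21.832, 10), 3))  # margins fixed as well, n' = 10
```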

If we limit our $n'$ to 10 = 16 − 6, we find P = .010. Now what does this signify?
It denotes that with the narrower proposition when we experiment in such a manner 

* The reason for the closeness of the two sets of $q_t$’s is that our sample being large the horizontal
margin total distribution gives nearly the same series of $q_t$’s as the set which produces the minimum
value of $\chi^2$.
as always to get the same relative numbers of coat colours(!) we may predict that
the four series (P = .01) are not homogeneous, whereas we are far less certain of this
(P = .04) when we take experiments which could be fairly easily repeated. But in
the former case we do not know whether the departure from independence really
lies in genetic conditions, or is due to the restraints which have been put on the
distribution of coat colour. The effect of abolishing these restraints appears at least
to suggest that they have contributed to the result.

Of course the desirable thing would be to abolish all restraints except the total
size of the sample, but this is impossible with regard to the $p_s$’s, for their arbitrary
character lies in the very nature of the experiment. To show to what extent the
arbitrary choice of $p_s$’s affects our conclusions, I will take the following table, which
is obtained practically by doubling the number of “Coupling, F1 Males” and
halving the number of “Repulsion, F1 Males.” There appears to be nothing more
arbitrary in this than in the results of the observed mating type proportions.
TABLE III.

                         Black Self   Black Piebald   Brown Self   Brown Piebald   Totals
Coupling    F1 Males        176           164            150            120          610
Coupling    F1 Females       38            34             30             21          123
Repulsion   F1 Males         58            46             40             65          209
Repulsion   F1 Females       96            88             95             79          358
Totals                      368           332            315            285         1300





Following Dr Fisher, that is making the relative coat-colour distribution
constrained, we obtain:

$\chi^2 = 16.2710$ and P = .062.

Thus even with the P = .05 limit, we could not now assert that the observed departures
from independence are not of a magnitude ascribable to chance.
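The value for Table III can be retraced in the same way as for Table II. As before this is a sketch under the assumption that “following Dr Fisher” means the ordinary contingency computation with both margins taken from the sample; `contingency_chi2` is an illustrative name.

```python
# Table III as quoted in the text: Coupling F1 males doubled,
# Repulsion F1 males (approximately) halved.
TABLE_III = [
    [176, 164, 150, 120],
    [38, 34, 30, 21],
    [58, 46, 40, 65],
    [96, 88, 95, 79],
]


def contingency_chi2(table):
    """Ordinary contingency chi^2: N * ( S n_st^2 / (n_s. n_.t) - 1 )."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n_total = sum(row_totals)
    s = sum(
        table[i][j] ** 2 / (row_totals[i] * col_totals[j])
        for i in range(len(table))
        for j in range(len(col_totals))
    )
    return n_total * (s - 1.0)


chi2_table_iii = contingency_chi2(TABLE_III)
print(round(chi2_table_iii, 4))
```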


This illustration points only too strongly to the caution requisite in applying 
this method to draw conclusions from observed data; the conclusion drawn will 
depend on the number of mice, and therefore on the relative number of crossings 
made in each one of the four sections of the quadruple experiment, and these are 
at the choice of the experimenter. 

If we do not limit our judgment to experiments giving always the same relative
proportions of coat colour, then we must enter with $n' = 13$, instead of 10. If we
take $q_t$’s of the parent population to be those given by the observed values, i.e.

$q_1 = .283,0769$, $q_2 = .255,3846$, $q_3 = .242,3077$, $q_4 = .219,2308$,

we have $\chi^2 = 16.2710$, and P = .180, a value which quite prohibits our concluding
that the departures from independence are not ascribable to chance. If we use the
“best” values of the $q_t$’s, they are:

$q_1 = .281,5800$, $q_2 = .254,4918$, $q_3 = .241,9760$, $q_4 = .221,9522$,

again unusually close to the observed values.
We then have for the “best” $\chi^2$:

$$\chi^2_{\min} = 16.2114,$$

giving P = .182, which is practically the same as we find from using the observed
$q_t$’s to represent the parent population.


I trust this discussion has to some extent cleared up the difficulties which await 
those who use a contingency table of multiple rows to question whether the multiple 
series involved in those rows may be treated as homogeneous, i.e. possible samples 
from a common parent population. We have seen that in Dr Fisher’s approach to the 
problem two difficulties arise. The first from the arbitrary numbers of mice in each 
experimental series, and the second from the constraints enforced on the coat colours. 
Dr Fisher is really proposing a series of experiments, each individual experiment giving
the same numbers of mice from each type of mating as occur in the observed experi- 
ment,—this may be needful,—and further the same relative proportions of coat colours. 
The latter is not needful, and would be practically impossible to achieve. Dr Fisher 
concludes that if he could repeat the multiple experiment under these conditions 
the $\chi^2$ would correspond to a low probability, but there is no evidence that this
result flows from the genetic constitution of the mice. Indeed, if we alter the 
numbers of mice from each type of mating, i.e. alter the number of matings, and 
leave in our experiments the distribution of coat colours to freely adjust themselves, 
we find $\chi^2$ can be so modified as to provide a probability, which is far from
suggesting that the departures from independence are not of a magnitude to be ascribed
to chance. The method therefore needs great caution in use, and there should always 
be an exact statement of what the problem supposed to be answered really is. 

It will be seen that the method of the contingency table fails in stating clearly 
what is the homogeneous population from which we are supposing the four series 
to be drawn, and, admitting the difficulty of the problem, I prefer to attack it by 
using (i) and comparing each pair of series with one another. The question is:
What value shall we give to the $p_s$’s for each pair? It seems very much better not
to use the observed sum of the columns, but to adopt for $p_s$ the value given in (v)
and accordingly for $\chi^2$ its minimum value in (iv). If we use these we have six pairs
of series to compare, and owing to the simplicity of (iv), the arithmetical work is no
more laborious than that of finding $\chi^2$ from a 4 × 4 table.

We take the reciprocals of the marginal totals column and by aid of them
determine the relative frequencies of each row. Thus we get the table:

TABLE IV.

Series     n_1/N        n_2/N        n_3/N        n_4/N         N
A        .288,5246    .268,8525    .245,9017    .196,7212     305
B        .308,9431    .276,4228    .243,9024    .170,7317     123
C        .275,1196    .222,4880    .191,3876    .311,0048     418
D        .268,1564    .245,8101    .265,3631    .220,6704     358
We now take the differences between the entries in each pair of rows, add these
differences and, squaring their sum, multiply by the corresponding value of
$\nu = NN'/(N + N')$, where N and N' are given in the last column.

For example, taking A and B, the differences, regardless of sign, are
.020,4185, .007,5703, .001,9993, .025,9895,
giving the sum .055,9776, and its square .0031,3349,
$\nu = 305 \times 123/(305 + 123) = 87.6519$,
and thus $\chi^2 = 87.6519 \times .0031,3349 = .2747$.
The P of course is to be looked up under the number of categories, i.e. $n' = 4$.
Proceeding in this way, we find

A and B: $\chi^2 = 0.2747$, P = .945
A and C: $\chi^2 = 9.2122$, P = .027
A and D: $\chi^2 = 1.2414$, P = .746
B and C: $\chi^2 = 7.480$, P = .059        (xv).
B and D: $\chi^2 = 1.8668$, P = .603
C and D: $\chi^2 = 7.3023$, P = .064

By the P = .02 criterion, none of these are significant; by the P = .05, the A
and C differences are. But we see at once that while the series A, B and D
might be considered as samples from the same population, the probability that
any one of them and C can be considered as such is of a much lower order.

Accordingly we take the sum of A, B and D and test it against C. Thus we have
the relative frequencies:

Series       n_1/N        n_2/N        n_3/N        n_4/N         N
A+B+D      .282,4427    .259,5420    .254,4529    .203,5624     786
C          .275,1196    .222,4880    .191,3876    .311,0048     418

giving the differences regardless of sign:
.007,3231, .037,0540, .063,0653, .107,4424,
with a total of .214,8848 and square .0461,7548; this with $\nu = 272.8804$ leads to
$\chi^2 = 12.6004$ and P = .006.
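The pairwise rule just described, $\chi^2_{\min} = \nu\,\{S|n_s/N - n'_s/N'|\}^2$ with $\nu = NN'/(N + N')$, lends itself to a short check. The sketch below is illustrative only: the names `SERIES` and `chi2_min_pair` are assumptions, and the counts are the rows of Table II labelled A to D as in Table IV.

```python
from itertools import combinations

# Rows of Table II, labelled as in Table IV.
SERIES = {
    "A": [88, 82, 75, 60],    # Coupling, F1 males
    "B": [38, 34, 30, 21],    # Coupling, F1 females
    "C": [115, 93, 80, 130],  # Repulsion, F1 males
    "D": [96, 88, 95, 79],    # Repulsion, F1 females
}


def chi2_min_pair(first, second):
    """nu * (sum of absolute differences of relative frequencies)^2."""
    n1, n2 = sum(first), sum(second)
    nu = n1 * n2 / (n1 + n2)
    total_abs_diff = sum(abs(a / n1 - b / n2) for a, b in zip(first, second))
    return nu * total_abs_diff ** 2


# All six pairs of series.
pair_values = {
    x + y: chi2_min_pair(SERIES[x], SERIES[y]) for x, y in combinations("ABCD", 2)
}

# A, B and D pooled and tested against C.
pooled = [a + b + d for a, b, d in zip(SERIES["A"], SERIES["B"], SERIES["D"])]
pooled_vs_c = chi2_min_pair(pooled, SERIES["C"])

for name, value in pair_values.items():
    print(name, round(value, 4))
print("A+B+D vs C", round(pooled_vs_c, 4))
```

Because the script works from the raw counts rather than the seven-figure relative frequencies of Table IV, agreement with the printed values is to be expected only to about the fourth decimal.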

The advantages of this process are that Table IV enables us to make any 
analysis we please of the material as we proceed, and that while (xv) is not 
definite, it indicates the line on which we can get a perfectly definite result. 
Finally we are certain that whatever parent population we may take for a pair, 
that chosen is the one which makes the observed result most probable ; in other 
words if definite heterogeneity may be predicted on the result thus obtained, it 
would certainly be predicted on any other assumption of a parental population 
distribution. We see that if A+ B+D and C be supposed to be two samples 
from the same parent population at a maximum whatever that population might 
be, two such samples could not arise more than 6 times in 1000 trials. 
The method is straightforward, the arithmetic simple, and we take P out of our
($\chi^2$, P) table with $n'$ equal to the number of categories in the series.


This method of approaching the problem is of course not free, any more than
the contingency table process, from the variation in $\chi^2$ (and therefore in P)
produced by the artificial choice of numbers of matings and the resulting numbers
of offspring. This difficulty is introduced by the factor $\nu = NN'/(N + N')$. Supposing
we keep the relative percentages of coat colour the same in the two series as well
as the total number of mice, the maximum value of $\nu$, for A + B + D as compared
with C, will be $\tfrac{1}{4}(N + N') = 301$, in our case leading to $\chi^2 = 13.8988$ and P = .003,
which makes some difference in P, but not in our conclusion*. The relative size
of the samples appears in such a simple form when we proceed from the biserial
method, that it is fairly easy to appreciate their influence. Failing any “natural”
distribution of the totals in the sub-experiments, it would perhaps be the simplest
rule for the research worker to keep them as near an equality as possible.
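The influence of the split N : N' on $\nu$, and through it on $\chi^2$, can be seen numerically. This sketch holds the squared sum of differences fixed at the value .0461,7548 found above for A + B + D against C and varies only the two sample sizes; the helper name `nu` is illustrative.

```python
def nu(n1, n2):
    """The factor N N' / (N + N') that multiplies the squared sum of differences."""
    return n1 * n2 / (n1 + n2)


SQUARED_SUM = 0.04617548  # (S |n_s/N - n'_s/N'|)^2 for A+B+D against C, from the text

observed_nu = nu(786, 418)  # split actually observed, total 1204
maximum_nu = nu(602, 602)   # equal split of the same total: (N + N') / 4

print(round(observed_nu, 4), round(observed_nu * SQUARED_SUM, 4))
print(round(maximum_nu, 4), round(maximum_nu * SQUARED_SUM, 4))
```

For a fixed total N + N' the factor is largest when the two samples are equal, which is the point of the rule suggested in the text.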


* The choice of the ratio N : N' may easily make a difference in our conclusions. Thus if in A we
had had 514 mice and in C 209, giving the same total 723, we should have had $\nu = 148.58368$ instead
of 176.33472, and, if the colour percentages in each series had remained much the same, we should have
$\chi^2 = 7.7625$ instead of 9.2122 and P = .133 instead of the .027 of (xv). The number of mice in the
C group (Repulsion, F1 Males) is the largest of all four sub-experiments, and we must be very cautious
to allow for this when, on the basis of the $\chi^2$ test, we attribute to C a genetic differentiation from A,
B and D.
1. Introductory Remarks. The theory of small samples has been developed to
a large extent from problems involving a single variate. The extension of the theory
to samples from multivariate populations has been made rather slowly and it is far
from complete at present. It was not until 1928 that Wishart† found the
simultaneous sampling distribution of the variances and covariances in samples from
a multivariate normal universe, whereas Fisher‡ solved the problem for a bivariate
normal population in 1915. In the same paper, Fisher found the distribution of the
correlation coefficient and in 1928 he solved the corresponding problem for the multiple
correlation coefficient§. The distribution introduced by “Student”‖ in 1908 in his
analysis of the ratio of the deviation of the mean of a sample from that of the
population to the standard deviation of the sample was more rigorously obtained by
Fisher¶ in 1925. At the same time Fisher extended its application to sampling
fluctuations of regression coefficients, differences of means and other problems which

* Presented to the American Mathematical Society, March 26, 1932.

† J. Wishart: “The generalized Product Moment Distribution in Samples from a normal
multivariate Population,” Biometrika, Vol. XXA. (1928), pp. 32–52.

‡ R. A. Fisher: “Frequency Distribution of the Values of the Correlation Coefficient in Samples
from an indefinitely large Population,” Biometrika, Vol. X. (1915), pp. 507–521.

§ R. A. Fisher: “The general sampling Distribution of the Multiple Correlation Coefficient,”
Proceedings of the Royal Society of London, Series A, Vol. CXXI. (1928), pp. 654–673.

‖ “Student”: “The probable Error of a Mean,” Biometrika, Vol. VI. (1908–1909), pp. 1–25.

¶ R. A. Fisher: “Applications of ‘Student’s’ distribution,” Metron, Vol. V. No. 3 (1925), pp. 90–104.
involve essentially a single variate. These ideas were generalized in 1931 by
Hotelling* who found the distribution of a quantity T which, when divided by the
square root of the number of degrees of freedom, is a natural extension of “Student’s”
original z to a sample from a multivariate normal population. We find very few
additional extensions of the kind with which we are concerned existing in the
literature.

Statistical coefficients which have not been adequately generalized for samples
from a multivariate population include the variance, ratio of variances, correlation
ratio, standard error of estimate when all variates are drawn at random, and certain
maximum likelihood test criteria developed for one-variable problems by Pearson
and Neyman†. As early as 1876 Helmert‡ found the distribution of the sum of
squares of deviations of a set of normally and independently distributed quantities
from the population mean, and in 1900 Karl Pearson§ solved the same problem for
the case where there is correlation among the variates and found the distribution of
$\chi^2$. In 1908 “Student”‖ suggested the form of the distribution of the sum of squares
of the deviations of the variates of a sample from the sample mean, which was
verified in 1915 by Fisher¶. By means of the distribution of the ratio of two
independently distributed variances, Fisher** has found the distribution of the multiple
correlation coefficient and the correlation ratio in samples from normal populations
in which these quantities are zero. He has extended the use of this distribution to
the problem of testing for the significance of variations in certain subvariances into
which the variance of a sample can be analyzed and has developed the theory of
intraclass correlations. Romanovsky†† introduced an extension of the ratio of variances
and found the sampling distribution of a quantity H which is the average of the
ratios of variances for two samples from a multivariate population. But this does
not seem to be a perfectly natural extension of the variance problem for samples
from multivariate populations as we shall see later. In 1928 E. S. Pearson and

* H. Hotelling: “The Generalization of ‘Student’s’ Ratio,” Annals of Mathematical Statistics,
Vol. II. (1931), pp. 359–378.

† J. Neyman and E. S. Pearson: “On the Use and Interpretation of certain test Criteria for purposes
of statistical Inference,” Biometrika, Parts I. and II. Vol. XXA. pp. 175–240, 263–294.

J. Neyman and E. S. Pearson: “On the Problem of k Samples,” Bulletin de l’Académie Polonaise des
Sciences et des Lettres, Série A, Sciences mathématiques, 1931, pp. 460–481.

‡ C. F. Helmert: “Über die Wahrscheinlichkeit der Potenzsummen der Beobachtungsfehler und
über einige damit in Zusammenhang stehende Fragen,” Zeitschrift für Mathematik und Physik, Vol. XXI.
(1876), pp. 192–219.

§ K. Pearson: “On the Criterion that a given set of Deviations from the probable in the case of
correlated Variables is such that can be reasonably supposed to have arisen from Random Sampling,”
Philosophical Magazine, 5th series, Vol. L. (1900), pp. 157–175.

‖ “Student”: loc. cit. [Helmert also in 1876 proved the equation for the distribution of the sums of
the squares of the deviations about the sample mean. See Astronomische Nachrichten, Bd. LXXXVIII.
S. 122, or Biometrika, Vol. xx. pp. 416–418. Ed.]

¶ R. A. Fisher: Biometrika, Vol. X. (1915), p. 507.

** R. A. Fisher: “On a Distribution yielding the Error Functions of several well-known Statistics,”
Proceedings of the International Mathematical Congress, Toronto (1924), Vol. II. pp. 805–813.

†† V. Romanovsky: “On the Criteria that two given Samples belong to the same Normal
Population,” Metron, Vol. VII. No. 3 (1928), pp. 3–46.
Neyman* began a series of papers in which they adopted the principle of maximum
likelihood as a means of obtaining criteria for testing various hypotheses in statis- 
tical inference. Among others they have obtained criteria appropriate to the 
hypotheses that two or more samples are drawn from the same normal population ; 
that a sample is drawn from a population with a specified mean; that two or more 
samples from populations having identical variances come from populations with 
identical means and a similar hypothesis stated by interchanging variances and 
means. They have thus far confined their work primarily to samples from normal 
populations of a single variable. 


Investigators dealing with samples of two or more correlated variables are 
confronted with the need of extended forms of the above statistical mechanisms. 
For example, measurements of several anthropological characters are obtained on 
two or more groups of men; how can we test the hypothesis that they are from the 
same race by a consideration of their variances and covariances? Similar problems 
arise in psychology concerning certain mental tendencies of two or more groups of 
individuals who have been measured on the basis of several mental traits; and so on
for other fields of statistical investigation. 


In this paper it is the purpose of the author to find the moments and distribu- 
tions of some of the foregoing statistical coefficients generalized for samples from 
a multivariate normal population and to exhibit a method of attack which seems 
to be novel in its application. Another problem which will be considered concerns 
the moments and distributions of the determinants and certain ratios of determinants 
of correlation coefficients in multivariate samples, from which a certain generaliza- 
tion of the multiple correlation coefficient is obtained. 


2. Solutions of two integral Equations. The moments of the class of statistical 
coefficients which we shall consider are of a form which is a constant multiple of 
a ratio of products of gamma functions. Most of the distributions can be derived 
from the solutions of two types of integral equations. We shall designate these two 
types by (A) and (B) and find their solutions before considering the main part of 
the problem. 


Type A. The first to be considered is of the form 


= ae T(aqt+k)T(ae+hk)... l(a +k) 
ok 9 ae pe es Bias Beat Baas ee cD Reet 
|. "3 J (2) de B r (ay) (ag) ey 5 (a@,) 


where & and the a’s are real and positive and B and f(z) are independent of &. 
f a 
By definition, C(a;+k)=] O8it*e-*%dd,. 
~0 
Hence, as far as the moments are concerned, the problem of finding f(z) is equivalent 
* J. Neyman and E. S. Pearson: Biometrika, Vol. XXA, Parts I and II, pp. 175-240.

J. Neyman and E. S. Pearson: Bulletin de l'Académie Polonaise des Sciences et des Lettres, Série A, 
Sciences mathématiques, 1930 and 1931. 
to that of finding the distribution of the product $z = B\,\theta_1\theta_2\cdots\theta_n$, where $\theta_i$ has the 
distribution 
$$\frac{1}{\Gamma(a_i)}\,\theta_i^{a_i-1}e^{-\theta_i}\,d\theta_i,\qquad (i=1,2,\ldots,n).$$


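Letting
$$\theta_n = \frac{z}{B\,\theta_1\theta_2\cdots\theta_{n-1}},$$
and substituting in
$$\prod_{i=1}^{n}\frac{1}{\Gamma(a_i)}\,\theta_i^{a_i-1}e^{-\theta_i}\,d\theta_i,$$
we have for the distribution of $z$,
$$f(z)=\frac{B^{-a_n}z^{a_n-1}}{\Gamma(a_1)\Gamma(a_2)\cdots\Gamma(a_n)}\int_0^\infty\!\!\int_0^\infty\!\!\cdots\!\int_0^\infty \theta_1^{a_1-a_n-1}\theta_2^{a_2-a_n-1}\cdots\theta_{n-1}^{a_{n-1}-a_n-1}\;e^{-\theta_1-\theta_2-\cdots-\theta_{n-1}-\frac{z}{B\theta_1\theta_2\cdots\theta_{n-1}}}\,d\theta_1\,d\theta_2\cdots d\theta_{n-1} \qquad (1).$$
By making the transformation $\theta_1\theta_2\cdots\theta_i=v_i$ $(i=1,2,\ldots,n-1)$, we can write
$$f(z)=\frac{B^{-a_n}z^{a_n-1}}{\Gamma(a_1)\Gamma(a_2)\cdots\Gamma(a_n)}\int_0^\infty\!\!\int_0^\infty\!\!\cdots\!\int_0^\infty v_1^{a_1-a_2-1}v_2^{a_2-a_3-1}\cdots v_{n-1}^{a_{n-1}-a_n-1}\;e^{-v_1-\frac{v_2}{v_1}-\cdots-\frac{v_{n-1}}{v_{n-2}}-\frac{z}{Bv_{n-1}}}\,dv_1\,dv_2\cdots dv_{n-1} \qquad (2).$$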


The author has succeeded in integrating this expression only for special values of 
the a’s and small values of n, which will be considered later. 


We note that the integral in (A) exists for all positive values of $k$ and hence all 
functions satisfying the integral equation (A) must have their kth moments identical 
(k=0, 1, 2, ...). The uniqueness of the continuous solution (2) can be established 
by the use of Stekloff’s* application of the theory of closure to the problem of 
moments. 
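
As a numerical illustration of the moment identity behind (A), the sketch below evaluates the right-hand side of (A) and compares it with the product of the $k$th moments of the individual gamma variates $\theta_i$, each obtained by a crude midpoint rule. The helper names `type_a_moment`, `gamma_moment_numeric` and `product_moment_numeric`, the truncation point `upper` and the step count are assumptions made for this illustration only; they do not come from the paper.

```python
import math


def type_a_moment(k, a, B=1.0):
    """Right-hand side of (A): B^k * prod_i Gamma(a_i + k) / Gamma(a_i)."""
    log_m = k * math.log(B)
    for a_i in a:
        log_m += math.lgamma(a_i + k) - math.lgamma(a_i)
    return math.exp(log_m)


def gamma_moment_numeric(k, a_i, upper=60.0, steps=100_000):
    """Midpoint-rule estimate of E[theta^k] for the density
    theta^(a_i - 1) * exp(-theta) / Gamma(a_i) on (0, infinity).
    The integral is truncated at `upper` (an assumption of this sketch)."""
    h = upper / steps
    total = 0.0
    for j in range(steps):
        t = (j + 0.5) * h
        total += t ** (a_i + k - 1) * math.exp(-t)
    return total * h / math.gamma(a_i)


def product_moment_numeric(k, a, B=1.0):
    """k-th moment of z = B * theta_1 * ... * theta_n for independent thetas."""
    m = B ** k
    for a_i in a:
        m *= gamma_moment_numeric(k, a_i)
    return m
```

For instance, `type_a_moment(2, [2.5, 1.5], B=0.5)` and `product_moment_numeric(2, [2.5, 1.5], B=0.5)` should agree to several decimal places, which is all that the equivalence "as far as the moments are concerned" asserts.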


Since we are dealing with non-negative functions, it is to be noted that if we 
had not known the range of z in (A), it could be argued from Stekloff’s theory that 
it must be from 0 to $\infty$. This type of argument is especially important in establishing 
the range of statistical coefficients which we shall consider. 


Type B. Next we shall consider the integral equation 


$$\int_0^B w^k g(w)\,dw = C\,B^k\,\frac{\Gamma(b_1+k)\,\Gamma(b_2+k)\cdots\Gamma(b_n+k)}{\Gamma(c_1+k)\,\Gamma(c_2+k)\cdots\Gamma(c_n+k)} \qquad (B),$$
where $C=\dfrac{\Gamma(c_1)\,\Gamma(c_2)\cdots\Gamma(c_n)}{\Gamma(b_1)\,\Gamma(b_2)\cdots\Gamma(b_n)}$, and $B$ and $g(w)$ are independent of $k$; where also 
the $b$'s and $c$'s are two sets of real and positive numbers such that there exists at 
least one way of pairing them such that each $b$ is less than its corresponding $c$. Thus 
we assume without loss of generality that $b_i<c_i$, $(i=1,2,\ldots,n)$. 


* W. Stekloff: "Quelques applications nouvelles de la théorie de fermeture au problème de représentation approchée des fonctions et au problème des moments," Mémoire de l'Académie Impériale des 
Sciences de St Pétersbourg, Vol. xxxii. No. 4, 1914. 
Let us multiply and divide the expression on the right of (B) by $\prod_{i=1}^{n}\Gamma(c_i-b_i)$. 
Then, since
$$\frac{\Gamma(p)\,\Gamma(q)}{\Gamma(p+q)}=\int_0^1 t^{p-1}(1-t)^{q-1}\,dt,$$
our problem is resolved to that of finding the distribution of the quantity $w=B\,t_1t_2\cdots t_n$, 
where the simultaneous distribution of the $t$'s is given by
$$\prod_{i=1}^{n}\frac{\Gamma(c_i)}{\Gamma(b_i)\,\Gamma(c_i-b_i)}\,t_i^{b_i-1}(1-t_i)^{c_i-b_i-1}\,dt_i \qquad (3).$$
If we make the substitution
$$t_n=\frac{w}{B\,t_1t_2\cdots t_{n-1}}$$
in (3), we have
$$g(w)=K\,B^{-b_n}w^{b_n-1}\int_{l_1}^{1}\!\int_{l_2}^{1}\!\!\cdots\!\int_{l_{n-1}}^{1} t_1^{b_1-b_n-1}t_2^{b_2-b_n-1}\cdots t_{n-1}^{b_{n-1}-b_n-1}\,(1-t_1)^{c_1-b_1-1}(1-t_2)^{c_2-b_2-1}\cdots(1-t_{n-1})^{c_{n-1}-b_{n-1}-1}\Bigl(1-\frac{w}{B\,t_1t_2\cdots t_{n-1}}\Bigr)^{c_n-b_n-1}dt_1\,dt_2\cdots dt_{n-1} \qquad (4),$$
where
$$K=\prod_{i=1}^{n}\frac{\Gamma(c_i)}{\Gamma(b_i)\,\Gamma(c_i-b_i)},\qquad l_i=\frac{w}{B\,t_1t_2\cdots t_{i-1}}.$$
In order to make the limits of the integrations in (4) independent of the variables, 
we shall make the following transformation
$$t_i=1-\Bigl(1-\frac{w}{B\,t_1t_2\cdots t_{i-1}}\Bigr)v_i,\qquad (i=1,2,\ldots,n-1).$$
As the result, we obtain
$$\begin{aligned}
g(w)={}&\frac{K\,w^{b_n-1}\bigl(1-\frac{w}{B}\bigr)^{\gamma_n-\beta_n-1}}{B^{b_n}}\int_0^1\!\!\int_0^1\!\!\cdots\!\int_0^1 v_1^{c_1-b_1-1}v_2^{c_2-b_2-1}\cdots v_{n-1}^{c_{n-1}-b_{n-1}-1}\\
&\times(1-v_1)^{\gamma_{n-1}-\beta_{n-1}-1}(1-v_2)^{\gamma_{n-2}-\beta_{n-2}-1}\cdots(1-v_{n-1})^{\gamma_1-\beta_1-1}\\
&\times\Bigl[1-v_1\Bigl(1-\frac{w}{B}\Bigr)\Bigr]^{b_1-c_2}\Bigl[1-\{v_1+v_2(1-v_1)\}\Bigl(1-\frac{w}{B}\Bigr)\Bigr]^{b_2-c_3}\cdots\\
&\times\Bigl[1-\{v_1+v_2(1-v_1)+\cdots+v_{n-1}(1-v_1)(1-v_2)\cdots(1-v_{n-2})\}\Bigl(1-\frac{w}{B}\Bigr)\Bigr]^{b_{n-1}-c_n}\\
&\times dv_1\,dv_2\cdots dv_{n-1}
\end{aligned}\qquad (5),$$
where $\gamma_i=\sum_{j=0}^{i-1}c_{n-j}$, $\beta_i=\sum_{j=0}^{i-1}b_{n-j}$, $(i=1,2,\ldots,n)$.

Since the distribution of $w$ has the range 0 to $B$, we have, for $B>w>0$, $1-\frac{w}{B}<1$. 
Then we can show by induction that
$$\{v_1+v_2(1-v_1)+\cdots+v_i(1-v_1)(1-v_2)\cdots(1-v_{i-1})\}\Bigl(1-\frac{w}{B}\Bigr)<1 \qquad (6)$$
for $0<v_i<1$, $(i=1,2,\ldots,n-1)$. Therefore, the series which results from expanding 
all of the factors in (5) involving the term (6) is uniformly convergent in the $v$'s over 
the field of integration and can be integrated term by term. This process yields the 
distribution function $g(w)$ which, again, is unique. 
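
The reduction of (B) to a finite-range integral can be checked numerically in the simplest non-trivial case $n=2$, where only one variable $v_1$ remains. The sketch below is an illustration under stated assumptions rather than part of the paper: the function names, the midpoint rule and the step count are all hypothetical choices. It evaluates $g(w)=K\,B^{-b_2}w^{b_2-1}(1-w/B)^{c_1+c_2-b_1-b_2-1}\int_0^1 v^{c_1-b_1-1}(1-v)^{c_2-b_2-1}[1-v(1-w/B)]^{b_1-c_2}\,dv$ and compares its moments with the right-hand side of (B).

```python
import math


def type_b_moment(k, b, c, B=1.0):
    """Right-hand side of (B), including the constant C."""
    log_m = k * math.log(B)
    for b_i, c_i in zip(b, c):
        log_m += math.lgamma(b_i + k) - math.lgamma(b_i)
        log_m += math.lgamma(c_i) - math.lgamma(c_i + k)
    return math.exp(log_m)


def g_two_factor(w, b, c, B=1.0, steps=300):
    """g(w) for n = 2, using a midpoint rule over the single variable v in (0, 1)."""
    (b1, b2), (c1, c2) = b, c
    log_K = sum(math.lgamma(ci) - math.lgamma(bi) - math.lgamma(ci - bi)
                for bi, ci in zip(b, c))
    x = 1.0 - w / B
    h = 1.0 / steps
    inner = 0.0
    for j in range(steps):
        v = (j + 0.5) * h
        inner += (v ** (c1 - b1 - 1) * (1.0 - v) ** (c2 - b2 - 1)
                  * (1.0 - v * x) ** (b1 - c2))
    inner *= h
    return (math.exp(log_K) * B ** (-b2) * w ** (b2 - 1)
            * x ** (c1 + c2 - b1 - b2 - 1) * inner)


def moment_from_density(k, b, c, B=1.0, steps=300):
    """Midpoint-rule estimate of the k-th moment of w over (0, B)."""
    h = B / steps
    total = 0.0
    for j in range(steps):
        w = (j + 0.5) * h
        total += w ** k * g_two_factor(w, b, c, B, steps)
    return total * h
```

With $b=(2,1)$ and $c=(4,3)$, for example, the zeroth moment should come out close to 1 and the first moment close to $\frac{2}{4}\cdot\frac{1}{3}=\frac16$.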
3. Generalization of the Variance of a Sample. Wishart* has shown that the 
simultaneous distribution of the variances and covariances of a sample of $N$ items 
from an $n$-variate normal population is given by
$$\frac{|A_{ij}|^{\frac{N-1}{2}}}{\pi^{\frac{n(n-1)}{4}}\prod_{i=1}^{n}\Gamma\!\left(\frac{N-i}{2}\right)}\;e^{-\sum_{i,j=1}^{n}A_{ij}a_{ij}}\;|a_{ij}|^{\frac{N-n-2}{2}}\,da \qquad (7),$$
where $|A_{ij}|$ is the $n$th order determinant of the elements $A_{ij}=\dfrac{N\Delta_{ij}}{2\sigma_i\sigma_j\Delta}$. $\Delta_{ij}$ denotes 
the co-factor corresponding to $\rho_{ij}$ in the determinant $\Delta=|\rho_{ij}|$ of population correlations 
and $\sigma_i$ the standard deviation in the population of the $i$th variate. Thus, 
if $|\sigma_i\sigma_j\rho_{ij}|$ is defined as the generalized variance of the population, then 
$|A_{ij}|$ is $N^n$ divided by $2^n$ times this generalized variance. $da$ is the product of the differentials of all of the $a$'s. The 
elements of the determinant $|a_{ij}|$ are the variances and covariances from the 
sample defined as
$$a_{ij}=a_{ji}=\frac{1}{N}\sum_{\alpha=1}^{N}(x_{i\alpha}-\bar x_i)(x_{j\alpha}-\bar x_j),\qquad (i,j=1,2,\ldots,n),$$
where $\bar x_i=\frac{1}{N}\sum_{\alpha=1}^{N}x_{i\alpha}$ is the sample mean of the $i$th variate, and $x_{i\alpha}$ is the value of 
the $i$th variate for the $\alpha$th individual. 

We shall adopt as the generalized sample variance the determinant $|a_{ij}|$. This 
quantity for $n$-variate samples and the ordinary variance for samples of one variate 
are similar, not only in the manner in which they enter the distribution of their 
component parts (there being only one part in the case of single variate samples), 
but in the way they arise in maximizing likelihood functions†. For example, the 
maximum of the likelihood expressed by (7) for variations of the population 
parameters $A_{ij}$ is $C\,|a_{ij}|^{-\frac{n+1}{2}}$, and if the $L$-function of the means is taken into 
account, the maximum of the joint $L$-function is proportional to $|a_{ij}|^{-\frac{n+2}{2}}$. In one-variable 
samples, the maximum $L$-functions for the two cases are proportional to 
$a^{-1}$ and $a^{-\frac32}$ respectively, where $a$ is the ordinary variance. 
Let us denote $|a_{ij}|$ by $\xi$ and proceed to find its $k$th moment $M_k(\xi)$. From the 
fact that the integral of (7) over the field of possible values of the $a$'s is unity, we 
have (using abbreviated notation)
$$\int e^{-\sum A_{ij}a_{ij}}\,|a_{ij}|^{\frac{N-n-2}{2}}\,da=\frac{\pi^{\frac{n(n-1)}{4}}\prod_{i=1}^{n}\Gamma\!\left(\frac{N-i}{2}\right)}{|A_{ij}|^{\frac{N-1}{2}}} \qquad (8).$$
* J. Wishart: loc. cit. 

+ In this paper we are primarily interested in various functions of the means, variances and 
covariances of a sample from a normal population. For this reason we shall express the probability of 
a sample in terms of the probability function F of its means, variances and covariances rather than in 
terms of the probabilities of the individual items of the sample. F is, of course, a function of the 
means, variances and covariances of both the sample and the population and is the product of two 
independent functions F,, and F’,,, where F,, is the distribution function of the variances and covariances 
and F,, is that of the means. For a specified sample (i) F, (ii) F, and (iii) F,,, may be considered as 
functions of population parameters, and will be called likelihood functions or simply L-functions of 
(i) the sample, (ii) the variances and covariances, and (iii) the means. 
Then
$$M_k(\xi)=\frac{1}{G}\int e^{-\sum A_{ij}a_{ij}}\,|a_{ij}|^{\frac{N+2k-n-2}{2}}\,da \qquad (9),$$
where $G$ is the expression on the right of (8). We have at once the value of the 
integral in (9) by substituting $N+2k$ for $N$ in the right side of (8). Therefore
$$M_k(\xi)=A^{-k}\,\frac{\Gamma\!\left(\frac{N-1}{2}+k\right)\Gamma\!\left(\frac{N-2}{2}+k\right)\cdots\Gamma\!\left(\frac{N-n}{2}+k\right)}{\Gamma\!\left(\frac{N-1}{2}\right)\Gamma\!\left(\frac{N-2}{2}\right)\cdots\Gamma\!\left(\frac{N-n}{2}\right)} \qquad (10),$$
where
$$A=|A_{ij}|=\frac{N^n}{2^n\sigma_1^2\sigma_2^2\cdots\sigma_n^2\,\Delta}.$$
The fact that a factor $N$ is concealed in each $A_{ij}$ does not invalidate (10), because (8) 
holds for all positive values of $\sigma_i$ and hence it holds when $\sigma_i$ is replaced by
$$\sigma_i\sqrt{\frac{N+2k}{N}}.$$
Clearly, this process will absorb the increment $2k$ in the $N$ multiplier of each $A_{ij}$ 
and will not affect the increment of $N$ entering at any other place in (8). The same 
result can be achieved by transforming the $A$'s and $a$'s by letting $A_{ij}=NB_{ij}$ and 
$a_{ij}=b_{ij}/N$ in (8), and finding the $k$th moment of $|b_{ij}|/N^n$, which is easily found to be 
$M_k(\xi)$. 
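
Formula (10) is easy to evaluate with log-gamma functions. The following sketch (the function name `generalized_variance_moment` and its argument layout are assumptions for illustration, not the author's notation) computes $M_k(\xi)$ from $N$, the population standard deviations and the determinant $\Delta$ of population correlations. Non-integral $k$ is allowed, which is what is needed when roots of the generalized variance are considered.

```python
import math


def generalized_variance_moment(k, N, sigmas, det_corr):
    """k-th moment of xi = |a_ij| according to (10).

    N        : number of items in the sample
    sigmas   : population standard deviations sigma_1, ..., sigma_n
    det_corr : determinant Delta of the population correlation matrix
    """
    n = len(sigmas)
    # log A, where A = N^n / (2^n * sigma_1^2 ... sigma_n^2 * Delta)
    log_A = (n * math.log(N / 2.0)
             - 2.0 * sum(math.log(s) for s in sigmas)
             - math.log(det_corr))
    log_m = -k * log_A
    for i in range(1, n + 1):
        log_m += math.lgamma((N - i) / 2.0 + k) - math.lgamma((N - i) / 2.0)
    return math.exp(log_m)
```

For a single variate with $N=10$ and $\sigma=2$ the first moment is $(N-1)\sigma^2/N=3.6$; for two variates with $\sigma_1=1$, $\sigma_2=2$, $\rho_{12}=0.5$ it is $(N-1)(N-2)\sigma_1^2\sigma_2^2(1-\rho_{12}^2)/N^2=2.16$.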
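If we denote the distribution of $\xi$ by $D(\xi)$, we must have
$$\int_0^\infty \xi^k D(\xi)\,d\xi=M_k(\xi) \qquad (11),$$
an integral equation of type (A). 
Therefore
$$D(\xi)=\frac{A^{\frac{N-n}{2}}\,\xi^{\frac{N-n-2}{2}}}{\prod_{i=1}^{n}\Gamma\!\left(\frac{N-i}{2}\right)}\int_0^\infty\!\!\int_0^\infty\!\!\cdots\!\int_0^\infty (v_1v_2\cdots v_{n-1})^{-\frac12}\;e^{-v_1-\frac{v_2}{v_1}-\cdots-\frac{v_{n-1}}{v_{n-2}}-\frac{A\xi}{v_{n-1}}}\,dv_1\,dv_2\cdots dv_{n-1} \qquad (12),$$
and the range of $\xi$ is from 0 to $\infty$. 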


If $n=1$, we get the well-known distribution of the variance in samples of a single 
variate
$$D_1(\xi)=\frac{\left(\frac{N}{2\sigma^2}\right)^{\frac{N-1}{2}}}{\Gamma\!\left(\frac{N-1}{2}\right)}\;\xi^{\frac{N-3}{2}}\,e^{-\frac{N\xi}{2\sigma^2}} \qquad (12a).$$
For $n=2$*
$$D_2(\xi)=\frac{2^{N-3}A_2^{\frac{N-2}{2}}}{\Gamma(N-2)}\;\xi^{\frac{N-4}{2}}\,e^{-2\sqrt{A_2\xi}},$$
where
$$A_2=\frac{N^2}{4\sigma_1^2\sigma_2^2(1-\rho_{12}^2)}.$$

* If $s_1$ and $s_2$ were two sample standard deviations and $r_{12}$ the correlation coefficient, then in this 
case $\xi=s_1^2s_2^2(1-r_{12}^2)$. 


The author has thus far been unable to obtain $D(\xi)$ explicitly for larger values 
of $n$. 
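
The bivariate case can be checked numerically. The sketch below takes the $n=2$ density in the form $D_2(\xi)=2^{N-3}A_2^{(N-2)/2}\xi^{(N-4)/2}e^{-2\sqrt{A_2\xi}}/\Gamma(N-2)$, which is what (12) gives when the remaining single integral is carried out, and confirms that it integrates to unity and reproduces the first moment given by (10). The function names, the substitution $\xi=u^2/A_2$ used for the quadrature, and the truncation point are assumptions of this illustration.

```python
import math


def d2_density(xi, N, A2):
    """Density of the generalized variance for bivariate samples (n = 2)."""
    log_d = ((N - 3) * math.log(2.0) + 0.5 * (N - 2) * math.log(A2)
             + 0.5 * (N - 4) * math.log(xi) - 2.0 * math.sqrt(A2 * xi)
             - math.lgamma(N - 2))
    return math.exp(log_d)


def d2_moment_numeric(k, N, A2, u_max=60.0, steps=100_000):
    """Midpoint-rule estimate of the k-th moment, integrating in u where xi = u^2 / A2."""
    h = u_max / steps
    total = 0.0
    for j in range(steps):
        u = (j + 0.5) * h
        xi = u * u / A2
        total += xi ** k * d2_density(xi, N, A2) * (2.0 * u / A2)
    return total * h


def d2_moment_exact(k, N, A2):
    """Moment formula (10) specialised to n = 2."""
    return math.exp(-k * math.log(A2)
                    + math.lgamma((N - 1) / 2.0 + k) - math.lgamma((N - 1) / 2.0)
                    + math.lgamma((N - 2) / 2.0 + k) - math.lgamma((N - 2) / 2.0))
```

With $N=10$ and $A_2=100/12$ (that is, $\sigma_1=1$, $\sigma_2=2$, $\rho_{12}=0.5$) the numerical first moment should agree with the value 2.16 obtained from (10).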

In practical work it may be desirable to use the nth root of the generalized 
variance, which would be the geometric mean of the variances of the n variates 
multiplied by the nth root of the determinant of correlations among the n variables. 


In this case the $k$th moments can be found from (10) by substituting $\frac{k}{n}$ for $k$. The 
distribution of the square root of the generalized variance for bivariate samples can 
be found from (12) by setting $\xi=\theta^2$, thus obtaining
$$g(\theta)=\frac{2^{N-2}A_2^{\frac{N-2}{2}}}{\Gamma(N-2)}\;\theta^{N-3}\,e^{-2\sqrt{A_2}\,\theta}.$$
Again, it may be important in certain situations to take the $2n$th root of the 
generalized variance, which would be the geometric mean of the standard deviations 
of the $n$ variates multiplied by the $2n$th root of the determinant of correlations. In 
this case the moments can be found by substituting $\frac{k}{2n}$ for $k$. 


4. Moments and Distribution of the Ratio of independent generalized Variances. 
The statistical coefficient to be considered here is a generalization of the ratio of 
two independently distributed variances whose distribution in samples of a single 
variate is used extensively by Fisher* in his analysis of variance. 

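Let us denote by $\xi$ and $\eta$ the generalized variances in two samples from $n$-variate 
populations in which the generalized variances are $\alpha$ and $\beta$ respectively. If we let 
$\frac{\xi}{\eta}=\psi$, then since $\xi$ and $\eta$ are independent, the $k$th moment of $\psi$ can be deduced 
from (10) as
$$M_k(\psi)=M_k(\xi)\,M_{-k}(\eta)=\left(\frac{B}{A}\right)^{k}\prod_{i=1}^{n}\frac{\Gamma\!\left(\frac{M-i}{2}+k\right)\Gamma\!\left(\frac{N-i}{2}-k\right)}{\Gamma\!\left(\frac{M-i}{2}\right)\Gamma\!\left(\frac{N-i}{2}\right)} \qquad (13),$$
where $A=\dfrac{M^n}{2^n\alpha}$, $B=\dfrac{N^n}{2^n\beta}$, $M$ and $N$ being the numbers of individuals in the samples. 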


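The distribution of $\psi$ can be readily derived from the distributions of $\xi$ and $\eta$, 
using the form given by (1). Accordingly, we have as the joint distribution of 
$\xi$ and $\eta$,
$$K\,A^{\frac{M-n}{2}}B^{\frac{N-n}{2}}\,\xi^{\frac{M-n-2}{2}}\eta^{\frac{N-n-2}{2}}\int_0^\infty\!\!\int_0^\infty\!\!\cdots\!\int_0^\infty (t_1\theta_1)^{\frac{n-3}{2}}(t_2\theta_2)^{\frac{n-4}{2}}\cdots(t_{n-1}\theta_{n-1})^{-\frac12}\;e^{-(t_1+\theta_1)-(t_2+\theta_2)-\cdots-(t_{n-1}+\theta_{n-1})-\frac{A\xi}{t_1t_2\cdots t_{n-1}}-\frac{B\eta}{\theta_1\theta_2\cdots\theta_{n-1}}}\,dT\,d\Theta \qquad (14),$$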


* R. A. Fisher: Proceedings of the International Mathematical Congress, Vol. II, Toronto (1924). 


S. S. WILKs 479 


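where
$$K=\frac{1}{\prod_{i=1}^{n}\Gamma\!\left(\frac{M-i}{2}\right)\Gamma\!\left(\frac{N-i}{2}\right)},\qquad dT=dt_1\,dt_2\cdots dt_{n-1},\qquad d\Theta=d\theta_1\,d\theta_2\cdots d\theta_{n-1}.$$

Making the transformation 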


— s; . 
-= - == 7; =i 2 eee = 
” vy, t; 1 = te 6;, (2 >=“ n 1), 





and integrating with respect to the $\theta$'s and $\eta$, we can write the distribution of $\psi$ in 
the following form: 


ll 1 
F(w)= H| | [A (1 — z)(1 — 51) (1 — 89)... (1 — Spa) + Besy se... 8,4 > 
oJo Jo 





<= a—" +2 : ae 

x[a(1—si)] ? [s(1—se)} 2 ... [sal —spa)] 7% dsidse... dsy4 

a (15), 
where 
y W. M—n—2 , 
a—1l1 9 ox , an N—& 2 ] = 

H=K or (* ‘\A 7 ps —¥ ——, €= ee ge RS 
é=0 9 ; (1+) n+l 2 l+y 


Without loss of generality, we can assume that B< A. For, if A< B we can make 


the transformation s;= 1—8;, z= 1—2Z, where z= = ye and get the desired form. 


B ; 
If we denote — by e, then we can write 








1 1 ‘ Fhe 
res PCS 
0/0 0 l+y 


€81 Sq... Sy i 
— | 1 —(1—8,) (1 — 8)... (1 — 8y_1) —- ———_ P 
| ( 1) ( 2 ( 1 l+wy | 





M+N—n-3 M+N—n-4 M+N—2n-1 
x [sa ( ] = 81) ] 2 [ Se (1 — S2)| 2 ewe [Saas a- Sn—1)] - ds, ds ees dsy_1 
ooseee (16). 
From the expression in the brace in the integrand, we have 
1 = (1 = 8)(L =)... (= ya) — Sot 
<i 
1 {> 81) (1 — se)... (1 —8y_1) 
l+y 
1—s,)(1 —s9)...(1 —s,- 
and = _ Sonne Snot) < | 
l+y 


for $0<s_i<1$ and $\psi>0$. Hence, the quantity in the brace can be expanded into 
a double infinite series which is uniformly convergent in the s’s in the region of 
integration, and the integrations can be performed term by term. 
For practical purposes, however, we can find the distribution of $\psi$ for $n=2$ by 
means of (12). Indeed, the joint distribution of $\xi$ and $\eta$ for this case is
$$\frac{2^{M+N-6}A_2^{\frac{M-2}{2}}B_2^{\frac{N-2}{2}}\,\xi^{\frac{M-4}{2}}\eta^{\frac{N-4}{2}}}{\Gamma(M-2)\,\Gamma(N-2)}\;e^{-2\left(\sqrt{A_2\xi}+\sqrt{B_2\eta}\right)} \qquad (17).$$
Setting $\frac{\xi}{\eta}=\psi$ and integrating with respect to $\eta$, we get
$$F(\psi)=\frac{\Gamma(M+N-4)\,A_2^{\frac{M-2}{2}}B_2^{\frac{N-2}{2}}\,\psi^{\frac{M-4}{2}}}{2\,\Gamma(M-2)\,\Gamma(N-2)\left(\sqrt{A_2\psi}+\sqrt{B_2}\right)^{M+N-4}} \qquad (18).$$
If the two samples are from populations with $\alpha=\beta$, then $F(\psi)$ will only involve 
$M$, $N$ and $\psi$. The condition $\alpha=\beta$, however, does not imply that the standard 
deviations and the correlations of the two variates for the two populations are 
identical. 

The foregoing analysis can be extended to two samples drawn from populations 
with different numbers of variates. The moments of the ratio of the two generalized 
variances can be readily inferred from (13). To consider the simplest case of this 
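kind, let $\xi$ be the variance in a sample of one variate and $\eta$ the generalized variance 
in a sample of two variates. It is reasonable to use as our statistical coefficient the 
ratio $\dfrac{\xi}{\sqrt{\eta}}=\theta$, instead of $\dfrac{\xi}{\eta}$. The distribution of $\theta$ is
$$F(\theta)=\frac{2^{N-2}A_1^{\frac{M-1}{2}}B_2^{\frac{N-2}{2}}\,\Gamma\!\left(\frac{M+2N-5}{2}\right)\theta^{\frac{M-3}{2}}}{\Gamma\!\left(\frac{M-1}{2}\right)\Gamma(N-2)\left(A_1\theta+2\sqrt{B_2}\right)^{\frac{M+2N-5}{2}}} \qquad (19).$$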


5. Ratio of independent Generalized Variance to any of its principal Minors. 
Here we shall consider the moments and distribution of the ratio of $|a_{ij}|$ in (7) to 
any one of its principal minors of $t$th order. Without loss of generality, we can 
take as our minor the one standing in the upper left corner of $|a_{ij}|$, because any 
principal minor can be shifted to that position by proper interchange of rows and 
columns in the determinant accompanied by similar changes in $\Delta$ to maintain the 
usual correspondence between the statistical coefficients and population parameters. 
Denote the ratio
$$\frac{|a_{ij}|}{|a_{pq}|},\qquad (i,j=1,2,\ldots,n;\ \ p,q=1,2,\ldots,t;\ \ t<n),$$
by $\zeta$. The $k$th moment of $\zeta$ is
$$M_k(\zeta)=\frac{A^{\frac{N-1}{2}}}{\pi^{\frac{n(n-1)}{4}}\prod_{i=1}^{n}\Gamma\!\left(\frac{N-i}{2}\right)}\int e^{-\sum_{i,j=1}^{n}A_{ij}a_{ij}}\,|a_{ij}|^{\frac{N-n-2+2k}{2}}\,|a_{pq}|^{-k}\,da \qquad (20).$$


We remark that the result of integrating (7) with respect to $a_{in}$, $(i=1,2,\ldots,n)$ is 
to reduce it to the distribution of the variances and covariances of the first $n-1$ 
variates of the $n$-variate sample. If we integrate further with respect to 
$a_{j,n-1}$, $(j=1,2,\ldots,n-1)$, 
we reduce it to the case of an $(n-2)$-variate population. If the process be performed 
$n-t$ times, the distribution is reduced to the case of a $t$-variate population. By 
argument similar to that used in deducing (10) from (9) we find
$$\int e^{-\sum_{i,j=1}^{n}A_{ij}a_{ij}}\,|a_{ij}|^{\frac{N-n-2+2k}{2}}\,|a_{pq}|^{-k}\,da'=\pi^{\frac{n(n-1)-t(t-1)}{4}}\left(\frac{|A^{(t)}_{pq}|}{|A_{ij}|}\right)^{\frac{N-1}{2}+k}\prod_{i=t+1}^{n}\Gamma\!\left(\frac{N-i}{2}+k\right)\,e^{-\sum_{p,q=1}^{t}A^{(t)}_{pq}a_{pq}}\,|a_{pq}|^{\frac{N-t-2}{2}} \qquad (21),$$
where the integration is performed with respect to all variables except those 
contained in $|a_{pq}|$, $A^{(t)}_{pq}=\dfrac{N\Delta^{(t)}_{pq}}{2\sigma_p\sigma_q\Delta^{(t)}}$ $(p,q=1,2,\ldots,t)$, $\Delta^{(t)}$ is the principal minor of 
order $t$ in the upper left corner of $\Delta$ and $\Delta^{(t)}_{pq}$ is the co-factor corresponding to $\rho_{pq}$ 
in $\Delta^{(t)}$. If (21) be integrated with respect to the variables $a_{pq}$ $(p,q=1,2,\ldots,t)$ and 
the result multiplied by
$$\frac{|A_{ij}|^{\frac{N-1}{2}}}{\pi^{\frac{n(n-1)}{4}}\prod_{i=1}^{n}\Gamma\!\left(\frac{N-i}{2}\right)},$$
we obtain
$$M_k(\zeta)=B^{-k}\prod_{i=t+1}^{n}\frac{\Gamma\!\left(\frac{N-i}{2}+k\right)}{\Gamma\!\left(\frac{N-i}{2}\right)},$$
where
$$B=\frac{N^{n-t}\,\Delta}{2^{n-t}\,\sigma_{t+1}^2\sigma_{t+2}^2\cdots\sigma_n^2\,\Delta^{(t)}}\cdot\left(\frac{\Delta^{(t)}}{\Delta}\right)^{2}=\frac{N^{n-t}\,\Delta^{(t)}}{2^{n-t}\,\sigma_{t+1}^2\sigma_{t+2}^2\cdots\sigma_n^2\,\Delta}.$$


Therefore, from (2) the distribution $f(\zeta)$ is
$$f(\zeta)=\frac{B^{\frac{N-n}{2}}\,\zeta^{\frac{N-n-2}{2}}}{\prod_{i=t+1}^{n}\Gamma\!\left(\frac{N-i}{2}\right)}\int_0^\infty\!\!\int_0^\infty\!\!\cdots\!\int_0^\infty (v_1v_2\cdots v_{n-t-1})^{-\frac12}\;e^{-v_1-\frac{v_2}{v_1}-\cdots-\frac{v_{n-t-1}}{v_{n-t-2}}-\frac{B\zeta}{v_{n-t-1}}}\,dv_1\,dv_2\cdots dv_{n-t-1} \qquad (22).$$
When $t=n-1$, we have the distribution of the variance of the difference between 
the $n$th variate and its estimate from the regression "plane" of the remaining 
$n-1$ variates, that is
$$h(\zeta)=\frac{B_1^{\frac{N-n}{2}}}{\Gamma\!\left(\frac{N-n}{2}\right)}\;\zeta^{\frac{N-n-2}{2}}\,e^{-B_1\zeta} \qquad (22a),$$
where $B_1=\dfrac{N}{2\sigma_n^2(1-R^2)}$ and $R$ is the multiple correlation coefficient between the 
$n$th variate and the first $n-1$ variates. 

For $t=n-2$
$$f(\zeta)=\frac{2^{N-n-1}B_2^{\frac{N-n}{2}}}{\Gamma(N-n)}\;\zeta^{\frac{N-n-2}{2}}\,e^{-2\sqrt{B_2\zeta}} \qquad (22b),$$
where
$$B_2=\frac{N^2\,\Delta^{(n-2)}}{4\sigma_{n-1}^2\sigma_n^2\,\Delta}.$$


6. Generalization of the Correlation Ratio. Let $p$ samples $\omega_\beta$ $(\beta=1,2,\ldots,p)$ of 
$n_\beta$ items respectively be drawn from a normal population of one variable and let 
$\bar x_\beta$ and $s_\beta^2$ be the mean and variance of $\omega_\beta$. Let $\Omega$ be the sample formed by pooling 
the $\omega$'s and let its mean and variance be denoted by $\bar X$ and $S^2$. The statistical 
coefficient $\eta^2$, defined as
$$\eta^2=\frac{\sum_{\beta=1}^{p}n_\beta(\bar x_\beta-\bar X)^2}{NS^2},\qquad \Bigl(N=\sum_{\beta=1}^{p}n_\beta\Bigr) \qquad (23),$$
is known as the correlation ratio with the samples $\omega_\beta$ forming the $p$ categories of 
the independent variable. The distribution of $\eta^2$ defined in this manner was first 
found by Fisher* by his analysis of variance, and later by Hotelling† by a different 
method. 

In this section we shall generalize the above definition of $\eta^2$ for samples from 
an $n$-variate normal population, and find its moments and distribution. We note 
from (23) that $\eta^2$ is the ratio of the weighted variance of the means of the $p$ sub-samples 
$\omega_\beta$ to the variance of $\Omega$. Now, let us suppose $p$ samples $\omega_\beta'$ $(\beta=1,2,\ldots,p)$ 
of $n_\beta$ items respectively to be drawn from an $n$-variate normal population. Let the 
sample formed by pooling the $\omega'$'s be $\Omega'$, which will have $\sum_{\beta=1}^{p}n_\beta=N$ items. The 
statistical coefficient to be considered is the ratio of the generalized weighted variance 
of the means of the $\omega'$'s to the generalized variance of $\Omega'$. That is
$$U=\frac{|b_{ij}|}{|a_{ij}|},$$
where
$$b_{ij}=b_{ji}=\frac{1}{N}\sum_{\beta=1}^{p}n_\beta(\bar X_{i\beta}-\bar X_i)(\bar X_{j\beta}-\bar X_j)$$
and
$$a_{ij}=a_{ji}=\frac{1}{N}\sum_{\beta=1}^{p}\sum_{\alpha=1}^{n_\beta}(x_{i\beta\alpha}-\bar X_i)(x_{j\beta\alpha}-\bar X_j),$$
where $\bar X_{i\beta}$ is the mean of the $i$th variate in the $\beta$th sample and $x_{i\beta\alpha}$ is the value of 
the $\alpha$th individual for the $i$th variate in the $\beta$th sample $\omega_\beta'$. We observe that $a_{ij}$ 
can be written as $b_{ij}+c_{ij}$, where
$$c_{ij}=c_{ji}=\frac{1}{N}\sum_{\beta=1}^{p}\sum_{\alpha=1}^{n_\beta}(x_{i\beta\alpha}-\bar X_{i\beta})(x_{j\beta\alpha}-\bar X_{j\beta}) \qquad (24),$$
or setting
$$\sum_{\alpha=1}^{n_\beta}(x_{i\beta\alpha}-\bar X_{i\beta})(x_{j\beta\alpha}-\bar X_{j\beta})=n_\beta v_{ij\beta},$$


* R. A. Fisher: Proceedings of the International Mathematical Congress, Vol. II, Toronto (1924). 
† H. Hotelling: "The Distribution of Correlation Ratios calculated from random Data," Proceedings 
of the National Academy of Sciences, Vol. XI, No. 10 (1925), pp. 657-662. 
we have
$$c_{ij}=\frac{1}{N}\sum_{\beta=1}^{p}n_\beta v_{ij\beta},\qquad (i,j=1,2,\ldots,n) \qquad (24a).$$

Clearly $v_{ij\beta}$ is the general element in the determinant of variances and covariances 
in the sample $\omega_\beta'$. The moments of $|a_{ij}|$ can be written down at once from (10). 

It is well known that the system of means is distributed independently of the 
system of variances and covariances in samples from an $n$-variate normal population. 
Therefore, the system $\{b_{ij}\}$* is distributed independently of the set $\{c_{ij}\}$. The 
means in a sample of $N$ individuals are distributed according to
$$\frac{|A_{ij}|^{\frac12}}{\pi^{\frac{n}{2}}}\;e^{-\sum_{i,j=1}^{n}A_{ij}(\bar X_i-m_i)(\bar X_j-m_j)}\,d\bar X \qquad (25),$$
where $A_{ij}$ is given in (7) and $m_i$ is the mean of $x_i$ in the population, which, except 
for a factor $N$ in each $A_{ij}$, is the distribution of the parent population. Therefore, 
the distribution of the set of statistical coefficients $\{b_{ij}\}$ can be deduced from (7), 
for $p>n$, as
$$\frac{A^{\frac{p-1}{2}}}{\pi^{\frac{n(n-1)}{4}}\prod_{i=1}^{n}\Gamma\!\left(\frac{p-i}{2}\right)}\;e^{-\sum_{i,j=1}^{n}A_{ij}b_{ij}}\;|b_{ij}|^{\frac{p-n-2}{2}}\,db \qquad (26).$$


The distribution of the variances and covariances $\{v_{ij\beta}\}$ in (24a) is given by (7) 
with $N$ replaced by $n_\beta$. Hence, it can be shown without much difficulty that the 
distribution of the set $\{c_{ij}\}$ is given by
$$\frac{A^{\frac{N-p}{2}}}{\pi^{\frac{n(n-1)}{4}}\prod_{i=1}^{n}\Gamma\!\left(\frac{N-p+1-i}{2}\right)}\;e^{-\sum_{i,j=1}^{n}A_{ij}c_{ij}}\;|c_{ij}|^{\frac{N-p-n-1}{2}}\,dc \qquad (27).$$





One way to prove (27) is to evaluate the characteristic function of the set $\{c_{ij}\}$ from 
the distributions of the quantities $\{v_{ij\beta}\}$, $(\beta=1,2,\ldots,p)$. Indeed, we define the 
characteristic function $\phi(\alpha)$ of $\{c_{ij}\}$ as 
» ng—l 
I] | A, | 7 a B . (Bp) — aijng ) vi 
= B=1 = = (4 79a up 
$ (a) = @ p=) is=1 N (2-8 


pr(n— 1) p 


e * 82 r(**) 
mitt \ = 





ng—n—2 


2 
x II |u| 2 dV ...(28), 
B=1 
where $dV$ is the product of the differentials of all the $v$'s, $\alpha$ is the set $\{\alpha_{ij}\}$, $\delta_{ij}$ is the 


* The brace notation, { }, will be used to indicate a system or aggregate, as distinct from the 
notation | | used to indicate a determinant. 
Kronecker delta which is unity for $i=j$ and zero for $i\neq j$, $A_{ij}^{(\beta)}=\frac{n_\beta}{N}A_{ij}$ and $\alpha_{ij}=\alpha_{ji}$. 


This integral breaks up into p integrals, each of the form (8). Applying the results 


of (8) on each of the integrals, we get 
_N=p 


J 
= 


—P| i; 


N = 
$(@) =|Ay| ? |40-3—3, 


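This is clearly the characteristic function belonging to (27). Hence (27) is the 
distribution of the system $\{c_{ij}\}$. We are now in a position to state that
$$\int e^{-\sum_{i,j=1}^{n}A_{ij}(b_{ij}+c_{ij})}\,|a_{ij}|^{k}\,|b_{ij}|^{\frac{p-n-2}{2}}\,|c_{ij}|^{\frac{N-p-n-1}{2}}\,db\,dc=\frac{A^{-k}}{H}\prod_{i=1}^{n}\frac{\Gamma\!\left(\frac{N-i}{2}+k\right)}{\Gamma\!\left(\frac{N-i}{2}\right)} \qquad (30),$$
where
$$H=\frac{A^{\frac{N-1}{2}}}{\pi^{\frac{n(n-1)}{2}}\prod_{i=1}^{n}\Gamma\!\left(\frac{p-i}{2}\right)\Gamma\!\left(\frac{N-p+1-i}{2}\right)},$$
which is the product of the constant coefficients of the distributions of the two sets 
of statistical coefficients $\{b_{ij}\}$ and $\{c_{ij}\}$. If in (30) we set $k=-h$, $p=p+2h$ and 
$N=N+2h$, afterwards multiplying by $H$, we have for the $h$th moment of $U$
$$M_h(U)=\prod_{i=1}^{n}\frac{\Gamma\!\left(\frac{N-i}{2}\right)\Gamma\!\left(\frac{p-i}{2}+h\right)}{\Gamma\!\left(\frac{p-i}{2}\right)\Gamma\!\left(\frac{N-i}{2}+h\right)}.$$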


The distribution $\phi(U)$ of $U$ satisfies an integral equation of type (B) and from (5) 
we have
$$\begin{aligned}
\phi(U)={}&\frac{\prod_{i=1}^{n}\Gamma\!\left(\frac{N-i}{2}\right)\,U^{\frac{p-n-2}{2}}(1-U)^{\frac{n(N-p)}{2}-1}}{\left[\Gamma\!\left(\frac{N-p}{2}\right)\right]^{n}\prod_{i=1}^{n}\Gamma\!\left(\frac{p-i}{2}\right)}\int_0^1\!\!\int_0^1\!\!\cdots\!\int_0^1 (v_1v_2\cdots v_{n-1})^{\frac{N-p}{2}-1}\\
&\times(1-v_1)^{\frac{(n-1)(N-p)}{2}-1}(1-v_2)^{\frac{(n-2)(N-p)}{2}-1}\cdots(1-v_{n-1})^{\frac{N-p}{2}-1}\,[1-v_1(1-U)]^{-\frac{N-p-1}{2}}\\
&\times[1-\{v_1+v_2(1-v_1)\}(1-U)]^{-\frac{N-p-1}{2}}\cdots\\
&\times[1-\{v_1+v_2(1-v_1)+\cdots+v_{n-1}(1-v_1)\cdots(1-v_{n-2})\}(1-U)]^{-\frac{N-p-1}{2}}\\
&\times dv_1\,dv_2\cdots dv_{n-1}
\end{aligned}\qquad (31).$$


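The range of $U$ is from 0 to 1, since $B=1$ in (5) for this case. 

For $n=1$ we get
$$\phi_1(U)=\frac{\Gamma\!\left(\frac{N-1}{2}\right)}{\Gamma\!\left(\frac{p-1}{2}\right)\Gamma\!\left(\frac{N-p}{2}\right)}\;U^{\frac{p-3}{2}}(1-U)^{\frac{N-p-2}{2}} \qquad (31a),$$
the well-known result obtained by Fisher and Hotelling for the distribution of $\eta^2$. 

For $n=2$
$$\begin{aligned}
\phi_2(U)&=\frac{\Gamma\!\left(\frac{N-1}{2}\right)\Gamma\!\left(\frac{N-2}{2}\right)}{\Gamma\!\left(\frac{p-1}{2}\right)\Gamma\!\left(\frac{p-2}{2}\right)\left[\Gamma\!\left(\frac{N-p}{2}\right)\right]^{2}}\;U^{\frac{p-4}{2}}(1-U)^{N-p-1}\int_0^1[v_1(1-v_1)]^{\frac{N-p-2}{2}}[1-v_1(1-U)]^{-\frac{N-p-1}{2}}\,dv_1\\
&=\frac{\Gamma\!\left(\frac{N-1}{2}\right)\Gamma\!\left(\frac{N-2}{2}\right)}{\Gamma\!\left(\frac{p-1}{2}\right)\Gamma\!\left(\frac{p-2}{2}\right)\Gamma(N-p)}\;U^{\frac{p-4}{2}}(1-U)^{N-p-1}\,F\!\left(\tfrac{N-p}{2},\tfrac{N-p-1}{2},N-p,1-U\right).
\end{aligned}$$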
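
The moments of $U$ are simple to compute from the gamma-function product $M_h(U)=\prod_{i=1}^{n}\Gamma(\frac{N-i}{2})\Gamma(\frac{p-i}{2}+h)\big/\bigl[\Gamma(\frac{p-i}{2})\Gamma(\frac{N-i}{2}+h)\bigr]$. The sketch below (function names are assumptions for illustration) evaluates that product and, for $n=1$, compares it with the moments of the beta distribution with parameters $\frac{p-1}{2}$ and $\frac{N-p}{2}$, which is the form of the correlation-ratio distribution for a single variate.

```python
import math


def moment_U(h, N, p, n):
    """h-th moment of U for p samples totalling N items from an n-variate population."""
    log_m = 0.0
    for i in range(1, n + 1):
        log_m += math.lgamma((N - i) / 2.0) - math.lgamma((N - i) / 2.0 + h)
        log_m += math.lgamma((p - i) / 2.0 + h) - math.lgamma((p - i) / 2.0)
    return math.exp(log_m)


def beta_moment(h, a, b):
    """h-th moment of a Beta(a, b) variate."""
    return math.exp(math.lgamma(a + h) - math.lgamma(a)
                    + math.lgamma(a + b) - math.lgamma(a + b + h))
```

For example, with $N=20$, $p=4$ and $n=1$ the mean of $U$ is $(p-1)/(N-1)=3/19$, and `moment_U(h, 20, 4, 1)` coincides with `beta_moment(h, 1.5, 8.0)` for any admissible $h$.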








7. Generalization of $1-\eta^2$. In the case of samples of one variable the distribution 
of $1-\eta^2$ can be found from that of $\eta^2$ by a simple change of variable, but 
such is not true in the generalization which we shall consider. 

From (23) we find that
$$1-\eta^2=\frac{\sum_{\beta=1}^{p}n_\beta s_\beta^2}{NS^2} \qquad (32),$$
that is, the ratio of the weighted mean of the variances of the samples $\omega_\beta$ to the 
variance of $\Omega$. The quantity which we shall consider as a generalization of $1-\eta^2$ is
$$W=\frac{|c_{ij}|}{|a_{ij}|} \qquad (33),$$
where $\{c_{ij}\}$ and $\{a_{ij}\}$ have been defined in Section 6. 


It can be shown that $W$ arises as the maximum likelihood criterion $\lambda_H$ of the 
type used by Neyman and E. S. Pearson for testing an hypothesis $H$ that $p$ samples 
$\omega_\beta$ are from a subclass $d$ of a class $D$ of admissible populations. In the present 
case $D$ is the class of all sets of $p$ $n$-variate normal populations in each set of which 
the corresponding variances and covariances are the same, but the means are 
completely independent, while $d$ is the subclass in each set of which the means 
are the same, that is to say in each set of which the $p$ populations are identical. 
The maximum of the $L$-function of the samples $\omega_\beta$ $(\beta=1,2,\ldots,p)$ for the populations 
of class $D$ is
$$M_D=J\,|c_{ij}|^{-\frac{N}{2}}\prod_{\beta=1}^{p}|v_{ij\beta}|^{\frac{n_\beta-n-2}{2}}.$$
The maximum of the $L$-function of the samples from populations of class $d$ is
$$M_d=J\,|a_{ij}|^{-\frac{N}{2}}\prod_{\beta=1}^{p}|v_{ij\beta}|^{\frac{n_\beta-n-2}{2}},$$
where $J$ is a constant depending on $n_\beta$ $(\beta=1,2,\ldots,p)$ and $n$. Therefore
$$\lambda_H=\frac{M_d}{M_D}=\left(\frac{|c_{ij}|}{|a_{ij}|}\right)^{\frac{N}{2}}=W^{\frac{N}{2}}.$$
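The $h$th moment of $W$, when $H$ is true, is found from (30) by setting $k=-h$, 
$N=N+2h$, and then multiplying by
$$\frac{A^{\frac{N-1}{2}}}{\pi^{\frac{n(n-1)}{2}}\prod_{i=1}^{n}\Gamma\!\left(\frac{p-i}{2}\right)\Gamma\!\left(\frac{N-p+1-i}{2}\right)}.$$
Accordingly, we get
$$M_h(W)=\prod_{i=1}^{n}\frac{\Gamma\!\left(\frac{N-i}{2}\right)\Gamma\!\left(\frac{N-p+1-i}{2}+h\right)}{\Gamma\!\left(\frac{N-i}{2}+h\right)\Gamma\!\left(\frac{N-p+1-i}{2}\right)} \qquad (34).$$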





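The distribution of $W$ is clearly of the form (5) and can be written as
$$\begin{aligned}
\theta(W)={}&\frac{\prod_{i=1}^{n}\Gamma\!\left(\frac{N-i}{2}\right)\,W^{\frac{N-p-n-1}{2}}(1-W)^{\frac{n(p-1)}{2}-1}}{\left[\Gamma\!\left(\frac{p-1}{2}\right)\right]^{n}\prod_{i=1}^{n}\Gamma\!\left(\frac{N-p+1-i}{2}\right)}\int_0^1\!\!\int_0^1\!\!\cdots\!\int_0^1 (v_1v_2\cdots v_{n-1})^{\frac{p-3}{2}}\\
&\times(1-v_1)^{\frac{(n-1)(p-1)}{2}-1}(1-v_2)^{\frac{(n-2)(p-1)}{2}-1}\cdots(1-v_{n-1})^{\frac{p-1}{2}-1}\,[1-v_1(1-W)]^{-\frac{p-2}{2}}\\
&\times[1-\{v_1+v_2(1-v_1)\}(1-W)]^{-\frac{p-2}{2}}\cdots\\
&\times[1-\{v_1+v_2(1-v_1)+\cdots+v_{n-1}(1-v_1)(1-v_2)\cdots(1-v_{n-2})\}(1-W)]^{-\frac{p-2}{2}}\\
&\times dv_1\,dv_2\cdots dv_{n-1}
\end{aligned}\qquad (35).$$
The range of $W$ is from 0 to 1. 

For $n=1$ we find
$$\theta_1(W)=\frac{\Gamma\!\left(\frac{N-1}{2}\right)}{\Gamma\!\left(\frac{N-p}{2}\right)\Gamma\!\left(\frac{p-1}{2}\right)}\;W^{\frac{N-p-2}{2}}(1-W)^{\frac{p-3}{2}} \qquad (35a),$$
the distribution of $1-\eta^2$. 

For $n=2$
$$\theta_2(W)=\frac{\Gamma\!\left(\frac{N-1}{2}\right)\Gamma\!\left(\frac{N-2}{2}\right)}{\Gamma\!\left(\frac{N-p}{2}\right)\Gamma\!\left(\frac{N-p-1}{2}\right)\Gamma(p-1)}\;W^{\frac{N-p-3}{2}}(1-W)^{p-2}\,F\!\left(\tfrac{p-1}{2},\tfrac{p-2}{2},p-1,1-W\right).$$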
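
The first moment of $W=|c_{ij}|/|a_{ij}|$ can be compared with a small simulation. The sketch below is an illustration under stated assumptions: it draws $p$ groups of bivariate normal observations with independent unit-variance components (any common covariance matrix would do, since the ratio is unaffected by it), forms the pooled within-group and total matrices of summed products, and averages the ratio of their determinants. The function names, group sizes, seed and number of repetitions are arbitrary choices, not taken from the paper.

```python
import math
import random


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def mean2(rows):
    """Componentwise mean of a list of 2-tuples."""
    return (sum(x[0] for x in rows) / len(rows),
            sum(x[1] for x in rows) / len(rows))


def scatter(rows, centre):
    """2x2 matrix of summed products of deviations of `rows` from `centre`."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for x in rows:
        d = (x[0] - centre[0], x[1] - centre[1])
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def simulate_W(group_sizes, rng):
    """One simulated value of W for bivariate samples (n = 2)."""
    groups = [[(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(m)]
              for m in group_sizes]
    pooled = [x for g in groups for x in g]
    total = scatter(pooled, mean2(pooled))      # N * a_ij
    within = [[0.0, 0.0], [0.0, 0.0]]           # N * c_ij
    for g in groups:
        s = scatter(g, mean2(g))
        for i in range(2):
            for j in range(2):
                within[i][j] += s[i][j]
    return det2(within) / det2(total)           # the factors N^2 cancel


def moment_W(h, N, p, n):
    """h-th moment of W from the gamma-function product (34)."""
    log_m = 0.0
    for i in range(1, n + 1):
        log_m += math.lgamma((N - i) / 2.0) - math.lgamma((N - i) / 2.0 + h)
        log_m += (math.lgamma((N - p + 1 - i) / 2.0 + h)
                  - math.lgamma((N - p + 1 - i) / 2.0))
    return math.exp(log_m)
```

With three groups of five bivariate observations ($N=15$, $p=3$, $n=2$) formula (34) gives a mean of $\frac{12}{14}\cdot\frac{11}{13}$, and the simulated average should lie close to this value.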


At this point we remark that since the elements $\{b_{ij}\}$ are distributed independently 
of the quantities $\{c_{ij}\}$ and since the distribution of each system is 
essentially of the form (7), we can deduce from (13) that the $k$th moment of the 
ratio $\dfrac{|b_{ij}|}{|c_{ij}|}$ is
$$\prod_{i=1}^{n}\frac{\Gamma\!\left(\frac{p-i}{2}+k\right)\Gamma\!\left(\frac{N-p+1-i}{2}-k\right)}{\Gamma\!\left(\frac{p-i}{2}\right)\Gamma\!\left(\frac{N-p+1-i}{2}\right)},$$
and its distribution can be found from (16). 

















8. Generalization of "Student's" Ratio. For samples of a single variable 
"Student's" ratio is defined as the quantity
$$z=\frac{\bar x-m}{s},$$
where $s$ is the standard deviation, $\bar x$ the mean of the sample and $m$ the population 
mean. In a recent paper, Hotelling* has generalized the statistical coefficient $z^2$ 
for samples from an $n$-variate normal population and has found the distribution 
of $T^2$ which is the product of the generalized $z^2$ and the number of degrees of 
freedom in the sample. However, we shall show that the distribution of this 
generalized ratio can be reached by the methods used in Sections 6 and 7. 


In a sample of $N$ individuals from an $n$-variate normal population, let the set 
of variances and covariances $\{a_{ij}\}$ be defined as in Section 3. Let the sample means 
be $\{\bar X_i\}$ and the corresponding population means be $\{m_i\}$. The distribution of the 
set $\{a_{ij}\}$ is given by (7) and that of $\{\bar X_i\}$ by (25). The statistical coefficient which we 
shall consider first is
$$Y=\frac{|a_{ij}|}{|e_{ij}|} \qquad (36),$$
where $e_{ij}=a_{ij}+(\bar X_i-m_i)(\bar X_j-m_j)$. 

It is not difficult to show that $Y$ arises as the maximum likelihood criterion 
$\lambda_{H'}$ for testing the hypothesis $H'$ that our sample is from a subset $d'$ of $n$-variate 
populations $D'$, where $d'$ is the class of normal populations having a specified set 
of means $\{m_i\}$ and any set of variances and covariances, and $D'$ is the class of all 
$n$-variate normal populations with any set of means, variances and covariances. 
Proceeding as in Section 7 for $\lambda_H$, we find
$$\lambda_{H'}=\left(\frac{|a_{ij}|}{|e_{ij}|}\right)^{\frac{N}{2}}.$$

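By the procedure used in finding the distribution of $\{c_{ij}\}$ in (27), we can show 
that the distribution of the set $\{e_{ij}\}$ is
$$\frac{A^{\frac{N}{2}}}{\pi^{\frac{n(n-1)}{4}}\prod_{i=1}^{n}\Gamma\!\left(\frac{N+1-i}{2}\right)}\;e^{-\sum_{i,j=1}^{n}A_{ij}e_{ij}}\;|e_{ij}|^{\frac{N-n-1}{2}}\,de \qquad (37).$$
Therefore, the $k$th moment of $|e_{ij}|$ is the same as that of $\xi$ in (10) with $N$ 
replaced by $N+1$. Since the means $\{\bar X_i\}$ and the system $\{a_{ij}\}$ are independently 
distributed, we shall have
$$\int e^{-\sum_{i,j=1}^{n}A_{ij}[a_{ij}+(\bar X_i-m_i)(\bar X_j-m_j)]}\,|e_{ij}|^{k}\,|a_{ij}|^{\frac{N-n-2}{2}}\,da\,d\bar X=\frac{\pi^{\frac{n(n-1)}{4}+\frac{n}{2}}}{A^{\frac{N}{2}+k}}\prod_{i=1}^{n}\frac{\Gamma\!\left(\frac{N-i}{2}\right)\Gamma\!\left(\frac{N+1-i}{2}+k\right)}{\Gamma\!\left(\frac{N+1-i}{2}\right)} \qquad (38).$$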


* H. Hotelling: Annals of Mathematical Statistics, Vol. II (1931). 
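Changing $N$ to $N+2h$ and $k$ to $-h$ and multiplying by
$$\frac{A^{\frac{N}{2}}}{\pi^{\frac{n(n-1)}{4}+\frac{n}{2}}\prod_{i=1}^{n}\Gamma\!\left(\frac{N-i}{2}\right)},$$
we get
$$M_h(Y)=\frac{\Gamma\!\left(\frac{N}{2}\right)\Gamma\!\left(\frac{N-n}{2}+h\right)}{\Gamma\!\left(\frac{N}{2}+h\right)\Gamma\!\left(\frac{N-n}{2}\right)} \qquad (39).$$
Hence, from (5), the distribution of $Y$ is
$$F(Y)\,dY=\frac{\Gamma\!\left(\frac{N}{2}\right)}{\Gamma\!\left(\frac{N-n}{2}\right)\Gamma\!\left(\frac{n}{2}\right)}\;Y^{\frac{N-n}{2}-1}(1-Y)^{\frac{n}{2}-1}\,dY \qquad (40),$$
with a range from 0 to 1. 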


By breaking up the rows or columns of $|e_{ij}|$ and expressing $|e_{ij}|$ as a sum of 
the resulting determinants, it can be readily shown that
$$Y=\frac{1}{1+\dfrac{T^2}{N-1}},$$
where
$$\frac{T^2}{N-1}=\sum_{p,q=1}^{n}\frac{G_{pq}(\bar X_p-m_p)(\bar X_q-m_q)}{|a_{ij}|},$$
and $G_{pq}$ is the co-factor of $a_{pq}$ in $|a_{ij}|$. 

Making the change of variable from $Y$ to $T$ in (40), we find the distribution 
of $T$ to be
$$\frac{2\,\Gamma\!\left(\frac{N}{2}\right)\,T^{\,n-1}\,dT}{\Gamma\!\left(\frac{N-n}{2}\right)\Gamma\!\left(\frac{n}{2}\right)(N-1)^{\frac{n}{2}}\left(1+\dfrac{T^2}{N-1}\right)^{\frac{N}{2}}} \qquad (41),$$
which is the distribution established by Hotelling. 


We note that (41) has been derived without making use of the property of the 
invariance of $T$ under all homogeneous linear transformations of the $n$ variates in 
the population. This property, however, played an important part in Hotelling's 
derivation. 
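
The algebraic link between $Y=|a_{ij}|/|e_{ij}|$ and $T^2$ can be verified directly on a small bivariate data set. The sketch below is illustrative only: the function name `student_ratio_pieces` and the sample values are made up for the example. It builds $a_{ij}$ and $e_{ij}$ as defined in this section, forms $T^2/(N-1)$ from the co-factors of $a_{ij}$, and returns both quantities so that $Y=1/\bigl(1+T^2/(N-1)\bigr)$ can be checked.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def student_ratio_pieces(sample, m):
    """Return (Y, T^2) for a bivariate sample and hypothetical population means m."""
    N = len(sample)
    xbar = [sum(x[i] for x in sample) / N for i in range(2)]
    # a_ij: sample variances and covariances with divisor N, as in Section 3
    a = [[sum((x[i] - xbar[i]) * (x[j] - xbar[j]) for x in sample) / N
          for j in range(2)] for i in range(2)]
    d = [xbar[i] - m[i] for i in range(2)]
    # e_ij = a_ij + (Xbar_i - m_i)(Xbar_j - m_j)
    e = [[a[i][j] + d[i] * d[j] for j in range(2)] for i in range(2)]
    Y = det2(a) / det2(e)
    # co-factors of a_pq in the 2x2 determinant |a_ij|
    cof = [[a[1][1], -a[1][0]], [-a[0][1], a[0][0]]]
    T2 = (N - 1) * sum(cof[p][q] * d[p] * d[q]
                       for p in range(2) for q in range(2)) / det2(a)
    return Y, T2
```

For any non-degenerate sample the two returned values satisfy the stated identity up to rounding error, and $Y$ lies between 0 and 1.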

9. Generalization of the $\lambda$-Criterion appropriate to $k$ Samples. In 1931, E. S. 
Pearson and Neyman* considered a certain maximum likelihood criterion $\lambda_H$ for 
testing the hypothesis $H$ that $k$ samples are drawn from a subclass $\omega$ of a class $\Omega$ 
of admissible populations, where $\Omega$ is the class of all sets of $k$ univariate normal 
populations and $\omega$ is the subclass in each set of which the $k$ populations have the 
same means and standard deviations. This criterion, for the case of $k$ samples of 
one variable, was found in the following manner: 

The maximum of the $L$-function of all $k$ samples from populations of class $\omega$ is
$$M_\omega=C\,(S_0^2)^{-\frac{N}{2}}\prod_{\beta=1}^{k}(s_\beta^2)^{\frac{n_\beta-3}{2}},$$
where $C$ is a constant depending on the $n_\beta$'s $(\beta=1,2,\ldots,k)$, the numbers of individuals 
in the samples, and $N=\sum_{\beta=1}^{k}n_\beta$, $s_\beta^2$ is the variance of the $\beta$th sample and $S_0^2$ 
the variance of the pool of the $k$ samples. 

The maximum of the joint likelihood of the $k$ samples from populations of class 
$\Omega$ is
$$M_\Omega=C\prod_{\beta=1}^{k}(s_\beta^2)^{-\frac{n_\beta}{2}}\prod_{\beta=1}^{k}(s_\beta^2)^{\frac{n_\beta-3}{2}}.$$
For the ratio of these maximums we have
$$\lambda_H=\frac{M_\omega}{M_\Omega}=\prod_{\beta=1}^{k}\left(\frac{s_\beta^2}{S_0^2}\right)^{\frac{n_\beta}{2}},$$
which is the criterion adopted by E. S. Pearson and Neyman for testing $H$. 

* J. Neyman and E. S. Pearson: Bulletin de l'Académie Polonaise des Sciences et des Lettres, Série A, 
Sciences mathématiques, 1930 and 1931. 
$\lambda_{H(n)}$ and $\lambda_{H'(n)}$ of this Section and the $\lambda_H$ of Section 7 are respectively generalizations for the case 
of $n$ variables of the $\lambda_H$, $\lambda_{H_1}$ and $\lambda_{H_2}$ used by these writers in the case of a single variable. 
The generalization of this criterion for testing hypothesis $H$ on $k$ samples from an 
$n$-variate normal population is straightforward. Indeed, the generalized criterion is
$$\lambda_{H(n)}=\prod_{\beta=1}^{k}\left(\frac{|s_{ij\beta}|}{|S_{ij0}|}\right)^{\frac{n_\beta}{2}} \qquad (42),$$
where $|s_{ij\beta}|$ is the generalized variance of the $\beta$th sample and $|S_{ij0}|$ is the 
generalized variance of the sample formed by combining the $k$ samples. 


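To find the moments of $\lambda_{H(n)}$, when the hypothesis is true, we proceed as in Sections 6 
and 7, and deduce at once from (10) the $t$th moment of $|S_{ij0}|^{\frac{N}{2}}$ $\bigl(N=\sum_{\beta=1}^{k}n_\beta\bigr)$ to be
$$A^{-\frac{Nt}{2}}\prod_{i=1}^{n}\frac{\Gamma\!\left(\frac{N(1+t)-i}{2}\right)}{\Gamma\!\left(\frac{N-i}{2}\right)} \qquad (43).$$
But, since the elements $\{S_{ij0}\}$ are expressible in terms of the variances, covariances 
and means of the $k$ samples under consideration, we must have
$$\begin{aligned}
&\int e^{-\sum_{i,j=1}^{n}\sum_{\beta=1}^{k}A^{(\beta)}_{ij}\left[s_{ij\beta}+(\bar X_{i\beta}-m_i)(\bar X_{j\beta}-m_j)\right]}\;\prod_{\beta=1}^{k}|s_{ij\beta}|^{\frac{n_\beta-n-2}{2}}\;|S_{ij0}|^{\frac{Nt}{2}}\,ds\,d\bar X\\
&\qquad=\pi^{\frac{kn(n-1)}{4}+\frac{kn}{2}}\,A^{-\frac{Nt}{2}}\left[\prod_{\beta=1}^{k}A_\beta^{-\frac{n_\beta}{2}}\prod_{i=1}^{n}\Gamma\!\left(\frac{n_\beta-i}{2}\right)\right]\prod_{i=1}^{n}\frac{\Gamma\!\left(\frac{N(1+t)-i}{2}\right)}{\Gamma\!\left(\frac{N-i}{2}\right)}
\end{aligned}\qquad (44),$$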
B=1 
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° ‘ ° n ° 
where Ag is the determinant A with N replaced by ng and A;,,;@) = v Ae: Setting 





Ng=ng(1+h), N=N (1 +h) and t=—- ot , and afterwards multiplying (44) by 
ee 
k 2 
i n(n—1),n , io —*< 
i=1 27/4) 


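we get for the $h$th moment of $\lambda_{H(n)}$
$$M_h(\lambda_{H(n)}) = \prod_{\beta=1}^{k}\left(\frac{N}{n_\beta}\right)^{\frac{nhn_\beta}{2}} \prod_{i=1}^{n}\left[\frac{\Gamma\left(\frac{N-i}{2}\right)}{\Gamma\left(\frac{N(1+h)-i}{2}\right)} \prod_{\beta=1}^{k}\frac{\Gamma\left(\frac{n_\beta(1+h)-i}{2}\right)}{\Gamma\left(\frac{n_\beta-i}{2}\right)}\right] \qquad (45).$$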


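For the case $n = 1$, we have
$$M_h(\lambda_H) = N^{\frac{Nh}{2}} \prod_{\beta=1}^{k}\left[\frac{\Gamma\left(\frac{n_\beta(1+h)-1}{2}\right)}{n_\beta^{\frac{n_\beta h}{2}}\,\Gamma\left(\frac{n_\beta-1}{2}\right)}\right] \frac{\Gamma\left(\frac{N-1}{2}\right)}{\Gamma\left(\frac{N(1+h)-1}{2}\right)},$$
which was found by E. S. Pearson and Neyman by direct integration.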
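As a numerical illustration of the case $n = 1$, the sketch below evaluates the $h$th moment of $\lambda_H$ from the Gamma-function expression and compares it with a crude simulation. It assumes the reading $\lambda_H = \prod_\beta (s_\beta^2/S_0^2)^{n_\beta/2}$ with divisors $n_\beta$ and $N$ in the variances; the function names and the simulation settings are illustrative only and are not part of the original paper.

```python
import math
import random


def moment_lambda_H(n_sizes, h):
    """h-th moment of lambda_H for k univariate samples (each n_beta >= 2)."""
    N = sum(n_sizes)
    log_m = 0.5 * N * h * math.log(N)
    for nb in n_sizes:
        log_m += math.lgamma((nb * (1 + h) - 1) / 2) - math.lgamma((nb - 1) / 2)
        log_m -= 0.5 * nb * h * math.log(nb)
    log_m += math.lgamma((N - 1) / 2) - math.lgamma((N * (1 + h) - 1) / 2)
    return math.exp(log_m)


def simulate_lambda_H(n_sizes, h, reps=20000, seed=1):
    """Monte Carlo estimate of E[lambda_H^h] when H is true (unit normal data)."""
    rng = random.Random(seed)
    N = sum(n_sizes)
    total = 0.0
    for _ in range(reps):
        samples = [[rng.gauss(0.0, 1.0) for _ in range(nb)] for nb in n_sizes]
        pooled = [x for s in samples for x in s]
        grand_mean = sum(pooled) / N
        S0 = sum((x - grand_mean) ** 2 for x in pooled) / N  # pooled variance
        log_lam = 0.0
        for s in samples:
            mean = sum(s) / len(s)
            sb = sum((x - mean) ** 2 for x in s) / len(s)    # sample variance
            log_lam += 0.5 * len(s) * math.log(sb / S0)
        total += math.exp(h * log_lam)
    return total / reps


if __name__ == "__main__":
    sizes = [8, 12]
    print(moment_lambda_H(sizes, 1.0), simulate_lambda_H(sizes, 1.0))
```

With a single sample ($k = 1$) the criterion is identically unity, and the expression reduces to 1 for every $h$, which gives a quick check on the transcription of the formula.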





Let us modify the hypothesis which yielded the criterion $\lambda_{H(n)}$ in (42), and
suppose that $\omega$ is the class of all sets of $n$-variate normal populations in each set
of which the corresponding variances and covariances are the same, but the means
are completely independent. Let the definition of $\Omega$ be unchanged. Then clearly
$\omega$ is a subclass of $\Omega$. Let the hypothesis that the $k$ samples are from the subclass $\omega$
of populations $\Omega$ be $H'_{(n)}$. Then the E. S. Pearson and Neyman criterion $\lambda_{H'(n)}$
appropriate to $H'_{(n)}$ is readily found to be
$$\lambda_{H'(n)} = \frac{M_\omega}{M_\Omega} = \prod_{\beta=1}^{k}\left[\frac{|s_{ij\beta}|}{|c_{ij}|}\right]^{\frac{n_\beta}{2}},$$
where
$$c_{ij} = \frac{1}{N}\sum_{\beta=1}^{k} n_\beta s_{ij\beta}.$$


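The distribution of the set $\{c_{ij}\}$ in this case is given by (27) with $p$ replaced by $k$.
Proceeding exactly as in the case of $\lambda_{H(n)}$, we find that the $h$th moment of $\lambda_{H'(n)}$ is
$$M_h(\lambda_{H'(n)}) = \prod_{\beta=1}^{k}\left(\frac{N}{n_\beta}\right)^{\frac{nhn_\beta}{2}} \prod_{i=1}^{n}\left[\frac{\Gamma\left(\frac{N-k+1-i}{2}\right)}{\Gamma\left(\frac{N(1+h)-k+1-i}{2}\right)} \prod_{\beta=1}^{k}\frac{\Gamma\left(\frac{n_\beta(1+h)-i}{2}\right)}{\Gamma\left(\frac{n_\beta-i}{2}\right)}\right] \qquad (45\ \text{bis}).$$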


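For the case $n = 1$, we have
$$M_h(\lambda_{H'}) = \prod_{\beta=1}^{k}\left[\left(\frac{N}{n_\beta}\right)^{\frac{hn_\beta}{2}} \frac{\Gamma\left(\frac{n_\beta(1+h)-1}{2}\right)}{\Gamma\left(\frac{n_\beta-1}{2}\right)}\right] \frac{\Gamma\left(\frac{N-k}{2}\right)}{\Gamma\left(\frac{N(1+h)-k}{2}\right)},$$
as found by E. S. Pearson and Neyman.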
10. Moments and Distributions of Ratios of Determinants of Correlation Coefficients.
The distribution of the correlation coefficient $r$ in samples from a normal
population in which the correlation is zero was first suggested by "Student"*
and later verified by Fisher†. From this distribution one can readily find that of
$1 - r^2$, which can be written as the determinant $\begin{vmatrix} 1 & r \\ r & 1 \end{vmatrix}$. In this section we shall
first consider the moments and distribution of the determinant of the correlation
coefficients in a sample from a normal population of $n$ independent variables.
That is, we shall find the moments and distribution of $|r_{ij}|$‡, where $r_{ij} = r_{ji}$ and
$r_{ii} = 1$. The distribution of variances and covariances in a sample from such a
population is given by (7), where the $\rho$'s are all made zero.


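Hence, we have, corresponding to (8),
$$\int e^{-\sum_{i=1}^{n} A_i a_{ii}}\, |a_{ij}|^{\frac{N-n-2}{2}}\, da = \pi^{\frac{n(n-1)}{4}} \prod_{i=1}^{n}\left[\Gamma\left(\frac{N-i}{2}\right) A_i^{-\frac{N-1}{2}}\right] \qquad (46),$$
where $A_i = \dfrac{N}{2\sigma_i^2}$.

If we change the variables by the transformation
$$a_{ij} = r_{ij}\sqrt{a_{ii}a_{jj}}, \qquad (i, j = 1, 2, \ldots n;\ i \neq j)$$
then
$$\int e^{-\sum_{i=1}^{n} A_i a_{ii}}\, (a_{11}a_{22}\ldots a_{nn})^{\frac{N-3}{2}}\, |r_{ij}|^{\frac{N-n-2}{2}}\, da\, dr = \pi^{\frac{n(n-1)}{4}} \prod_{i=1}^{n}\left[\Gamma\left(\frac{N-i}{2}\right) A_i^{-\frac{N-1}{2}}\right] \qquad (47).$$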


That the set $\{r_{ij}\}$ is distributed independently of the set $\{a_{ii}\}$ can be shown by
evaluating the characteristic function of $\{a_{ii}\}$ which is known to be
$$\phi(\alpha) = \prod_{i=1}^{n}\left[\frac{A_i}{A_i-\alpha_i}\right]^{\frac{N-1}{2}},$$
since the $a$'s are variances in samples from independent populations. This characteristic
function must also satisfy the relation
$$\phi(\alpha) = \frac{1}{H}\int e^{-\sum_{i=1}^{n}(A_i-\alpha_i)a_{ii}}\, (a_{11}a_{22}\ldots a_{nn})^{\frac{N-3}{2}}\, |r_{ij}|^{\frac{N-n-2}{2}}\, da\, dr,$$
where $H$ is the quantity on the right side of (47). From Stekloff's theory it
follows that
$$\int |r_{ij}|^{\frac{N-n-2}{2}}\, dr$$



* "Student": "The Probable Error of a Correlation Coefficient," Biometrika, Vol. VI. (1908-1909),
pp. 302-310.

† R. A. Fisher: Biometrika, Vol. X. (1915).

‡ See Ragnar Frisch: "Correlation and Scatter in Statistical Variables," Nordisk Statistisk Tidskrift,
Vol. VIII. (1928), pp. 36-102.

In this paper Frisch considers the significance of the determinant $|r_{ij}|$ and its principal minors in
the procedure of fitting linear regression equations to scatter diagrams with special reference to the
matter of detecting the presence of irrelevant or uncorrelated variables. He refers to the quantity
$+\sqrt{|r_{ij}|}$ as the collective scatter coefficient of the sample, but he does not consider the problem of
finding its sampling moments and distribution.
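is necessarily a constant independent of the $a$'s. Thus, we can deduce from (47)
that the $k$th moment of $|r_{ij}| = \omega$ is
$$M_k(\omega) = \frac{\left[\Gamma\left(\frac{N-1}{2}\right)\right]^{n-1} \prod_{i=2}^{n}\Gamma\left(\frac{N-i}{2}+k\right)}{\left[\Gamma\left(\frac{N-1}{2}+k\right)\right]^{n-1} \prod_{i=2}^{n}\Gamma\left(\frac{N-i}{2}\right)} \qquad (48).$$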
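The moment expression (48) is easily evaluated numerically. The following is a minimal sketch, assuming (48) is read as $[\Gamma(\frac{N-1}{2})/\Gamma(\frac{N-1}{2}+k)]^{n-1}\prod_{i=2}^{n}\Gamma(\frac{N-i}{2}+k)/\Gamma(\frac{N-i}{2})$; the function name is illustrative. For $n = 2$ and $k = 1$ it returns the familiar mean value $(N-2)/(N-1)$ of $1 - r^2$.

```python
import math


def moment_corr_det(N, n, k):
    """k-th moment of the determinant |r_ij| for a sample of N from n
    independent normal variables, using the Gamma-function form of (48)."""
    log_m = 0.0
    for i in range(2, n + 1):
        log_m += math.lgamma((N - 1) / 2) - math.lgamma((N - 1) / 2 + k)
        log_m += math.lgamma((N - i) / 2 + k) - math.lgamma((N - i) / 2)
    return math.exp(log_m)


if __name__ == "__main__":
    # Mean of 1 - r^2 for N = 10: (N - 2) / (N - 1)
    print(moment_corr_det(10, 2, 1), 8 / 9)
    # Mean of the 3 x 3 determinant for N = 10: (N - 2)(N - 3) / (N - 1)^2
    print(moment_corr_det(10, 3, 1), 56 / 81)
```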


The distribution of $\omega$ can therefore be written from (5) as


N—-n n(n—1) 
| cee! (7 *) @ 2 a (1 _ @) + ay . n—2 


it Ir es ‘ r (>) 


n(n—l) 3 n(m—1) 5 n—3 





x(l—m) * 2(l-m) * 2...(1--m-2) 2? {1-4 (1—)}4 

x [1 — {oy + v2 (1 — 11); 1 —o)f"... $ 
x [1 — {og + ve (1 — 01) +... + ¥,_-2(1 — 1)... (1 — vn_s)} (1 - oy? 
ae, EER iy Sp eae, 88 SNe HERR A cap Oe ee Oe WAP (49). 


For n = 2 we have 


the well-known distribution of $1 - r^2$.
For n=3 
‘ 


[? (- — *) a > “(1 —w)t 


= N-2 N-3 
rors CE) 


-_ _ 





Fs(@) = F[3,4,3,l-o] ...... (490). 


At this point we shall introduce a slightly more general function of the $r$'s and
consider the ratio of $\omega$ to the product of $k$ of its principal minors which are
mutually exclusive. Without loss of generality, we can consider the $k$ minors as
placed corner to corner down the main diagonal of $|r_{ij}|$, such that each element in
this diagonal (all equal to unity) is included in one of the minors.

Thus, let
$$Z = \frac{\omega}{\prod_{\beta=1}^{k}\omega_\beta} \qquad (50),$$
where $\omega_\beta$ is the $\beta$th principal minor from the upper left corner of $\omega$ and contains
the inter-correlations of $p_\beta$ variates and $\sum_{\beta=1}^{k} p_\beta = n$. If we multiply and divide the
quantity on the right in (50) by $\prod_{i=1}^{n} a_{ii}$, we have $Z$ expressed as the ratio of the
generalized variance to the product of $k$ of its principal minors, all of which are
mutually exclusive and are such that every element of the main diagonal of $|a_{ij}|$ is
contained in one of them. If (46) be integrated with respect to all of the $a$'s not
contained in this set of principal minors, which will be denoted collectively by $\bar{a}$,
we must have





_8 ia N-n-2 
fe alas) 2 da 
n(n-1) » -_ 
wr * I hy (- ° *) a er N—ng—2 
he inl _ = laje| 2 + d(a—@)...... (51), 

; Pp Pg) Pg N-a s=1 

ll [= i r( )| 

B=1 L a=l1 2 


where $|a_{ij\beta}|$ is the $\beta$th minor in $|a_{ij}|$, and $a - \bar{a}$ is the set of $a$'s in (46) not
contained in $\bar{a}$.


k 
If both sides of (51) be multiplied by II |ajg|-*, which is constant as far as the 
B=1 
set @ is concerned, and if NV be replaced by N + 2h, afterwards integrating with 
respect to a—d, then multiplying by 
N-1 
(Ay Az eee A.) 2 


n(n—1) in . 

2G nw NI 

7 * UT(=5 ) 
i=1 / 





we obtain the expression for the $h$th moment of $Z$, which is
$$M_h(Z) = \prod_{\beta=1}^{k}\prod_{\alpha=1}^{p_\beta} \frac{\Gamma\left(\frac{N-\alpha}{2}\right)}{\Gamma\left(\frac{N-\alpha}{2}+h\right)} \prod_{i=1}^{n} \frac{\Gamma\left(\frac{N-i}{2}+h\right)}{\Gamma\left(\frac{N-i}{2}\right)} \qquad (52).$$

This is clearly the $h$th moment of a function satisfying an integral equation of
type (B), and hence the distribution of $Z$ is of the form (5), where $Z$ ranges from
0 to 1. An important case of (52) arises when $k = 2$, $p_1 = n - 1$, and $p_2 = 1$, which
yields the $h$th moment of $1 - R^2$, where $R$ is the multiple correlation coefficient
in a sample from a population with its multiple correlation coefficient zero.

That is
$$M_h(Z_1) = \frac{\Gamma\left(\frac{N-1}{2}\right)\Gamma\left(\frac{N-n}{2}+h\right)}{\Gamma\left(\frac{N-1}{2}+h\right)\Gamma\left(\frac{N-n}{2}\right)}.$$
From this we get the result
$$f_1(Z_1) = \frac{\Gamma\left(\frac{N-1}{2}\right)}{\Gamma\left(\frac{N-n}{2}\right)\Gamma\left(\frac{n-1}{2}\right)}\, Z_1^{\frac{N-n-2}{2}} (1-Z_1)^{\frac{n-3}{2}},$$
from which we can deduce the distribution of $R^2$, by the simple change of variable
$Z_1 = 1 - R^2$, a result originally obtained by Fisher.
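The general moment formula (52) and its special case for $1 - R^2$ can be compared numerically. The sketch below assumes (52) is read as a ratio of Gamma products over $i = 1, \ldots, n$ (numerator) and over $\alpha = 1, \ldots, p_\beta$ within each group (denominator); the function names are illustrative. For two groups of sizes $n - 1$ and 1 the moments agree with those of a Beta law with parameters $\frac{N-n}{2}$ and $\frac{n-1}{2}$, which is the distribution of $1 - R^2$ stated above.

```python
import math


def moment_Z(N, group_sizes, h):
    """h-th moment of Z for variates split into groups of the given sizes."""
    n = sum(group_sizes)
    log_m = 0.0
    for i in range(1, n + 1):
        log_m += math.lgamma((N - i) / 2 + h) - math.lgamma((N - i) / 2)
    for p in group_sizes:
        for a in range(1, p + 1):
            log_m += math.lgamma((N - a) / 2) - math.lgamma((N - a) / 2 + h)
    return math.exp(log_m)


def beta_moment(a, b, h):
    """h-th moment about zero of a Beta(a, b) variate."""
    return math.exp(math.lgamma(a + h) + math.lgamma(a + b)
                    - math.lgamma(a) - math.lgamma(a + b + h))


if __name__ == "__main__":
    N, n = 20, 4
    # 1 - R^2: one variate against the remaining n - 1
    print(moment_Z(N, [n - 1, 1], 1), (N - n) / (N - 1))
    print(moment_Z(N, [n - 1, 1], 2.5), beta_moment((N - n) / 2, (n - 1) / 2, 2.5))
```

When every group contains a single variate, $Z$ is the determinant $|r_{ij}|$ itself, so `moment_Z(N, [1] * n, k)` gives the moments of $\omega$.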
The distribution of $Z$ can be found for a slightly more general problem than
the one we have just considered. Thus, suppose a sample of $N$ items is drawn
from each of $k$ independent normal populations, where the $\beta$th population has $p_\beta$
inter-correlated variates. Let $v_\beta$ be the generalized variance of the $\beta$th sample
and $V$ the generalized variance of the $k$ samples treated as one sample with
$$\sum_{\beta=1}^{k} p_\beta = n$$
variates. Then it can be shown in a straightforward manner by the foregoing
method that the $h$th moment of
$$\frac{V}{\prod_{\beta=1}^{k} v_\beta} = Z$$
is identical with $M_h(Z)$.

The use of $Z$ as a criterion for testing the hypothesis that the $k$ samples are
from $k$ independent normal populations can be interpreted as an extension of the
use of $1 - R^2$ as a criterion for testing the hypothesis that a sample of $N$ items of
a single variable and a sample of $N$ items of $n - 1$ variables are from independent
populations. For example, the criterion $Z_2$ appropriate to testing the hypothesis
that a sample of $N$ items of two variables and one of $N$ items of $n - 2$ variables
are from independent populations, will have its $h$th moment given by (52) for the
special case $k = 2$, $p_1 = 2$, and $p_2 = n - 2$, that is,
$$M_h(Z_2) = \frac{\Gamma\left(\frac{N-1}{2}\right)\Gamma\left(\frac{N-2}{2}\right)\Gamma\left(\frac{N-n}{2}+h\right)\Gamma\left(\frac{N-n+1}{2}+h\right)}{\Gamma\left(\frac{N-n}{2}\right)\Gamma\left(\frac{N-n+1}{2}\right)\Gamma\left(\frac{N-1}{2}+h\right)\Gamma\left(\frac{N-2}{2}+h\right)},$$





and hence, from (5), we find the distribution of Z, to be 
N~3\.j/0-i 2 —-. * 
r( 5 )r( 5 )Ze 2 (1-Z,)3 


,(N-—n+1\ N—-n 
r(: = )P( 3 )P@-2) 


— / 


f2(Z2) = 





« 9 acl 1 
,jn—-3 n—-2 : ; 
F| = (eR 


~ 


It can be shown that $Z$ is the $\lambda$-criterion appropriate to testing the hypothesis
that the $n$ variables in the population fall into $k$ groups, in each of which the
variates may be inter-correlated, but such that no variate in one group is correlated
with any variate in another.

The practical application of the criteria developed in this paper must be left 
for further discussion. 

















MISCELLANEA. 


On a Method of proceeding from partial Cell Frequencies to 
Ordinates and to total Cell Frequencies in the case of a 
bivariate Frequency Surface. 


By JACQUES CHAPELIN, D.Sc. 


It may happen that to define a bivariate population, the whole population in every cell is not
observed, but only the population in a partial cell.

For instance, in order to get a rough evaluation of the amount of wood in a forest, foresters
used to divide the area of the forest into rectangular cells, and in each of the rectangles to
measure only the volume of the trees falling inside a partial domain, sometimes a strip parallel
to one of the sides of the rectangle, coaxial with it and with breadth one-tenth or one-twentieth
of the other side. Calling $\phi$ the (measured) volume corresponding to such a strip or partial cell,
it is required to find a plausible value for the volume $f$ inside the rectangle or total cell from
which, by addition, a plausible value for the volume of the forest may be deduced. An easy
solution would be to use a simple rule of proportionality: the volume of wood in a total cell
would be taken equal to the volume in the corresponding partial cell multiplied by the ratio of
the area of a total cell to a partial cell. This supposes that the $z$-ordinate corresponding to the
ideal density surface is satisfactorily represented by the $z$-ordinate of a hyperbolic paraboloid.
A more refined solution would be to use Pearson's interpolation surface of the fourth order*.
The aim of this paper is to obtain the necessary formulae: in the general case, the value of $f$ for
a total rectangle $R$ is a linear and homogeneous function of the values of the nine $\phi$'s corresponding
to this rectangle $R$ and to the eight rectangles adjoining $R$, and this linear form is defined
by the first line of Table III.

To take another example, let us suppose we have a population of $N$ living animals, classified
into classes, according to a character $X$, to which corresponds a first variate $x$. We wish to
study the lethal dose $y$ of a drug, according to the character $X$. For reasons of economy, we do
not want either to spend too much of this drug or to kill the whole population. Then, we
decide to try the drug on one-fifth of the population. We divide each class $(x - \frac{1}{2}, x + \frac{1}{2})$ into
five equal parts, and we experiment only on the animals belonging to the middle class. Thus, we
are led to partial cells with breadth $\frac{1}{5}$ and height 1, giving a two-dimensional set of $\phi$'s from
which we shall have to deduce the $f$'s, in order to be able to build up the usual correlation
Table. To that effect, we could also use the formulae defined by Table III of this paper.

We shall suppose that the basic net is a system of squares with unit sides and that the
system of partial cells consists of rectangles concentric and coaxial with the basic square cells.
We shall use the ordinary Pearsonian interpolation formula corresponding to the mid-panel
central difference formula up to and including second order differences. The sides of any of the
rectangular cells will be $\alpha$ and $\beta$, and we shall call $\phi_{00}$, $\phi_{01}$, ... the nine observed frequencies
in nine partial rectangular cells, according to the usual Pearsonian scheme. If $\alpha = \beta = 1$, the
quantities $\phi_{ij}$ reduce to the usual frequencies $f_{ij}$. The problem is to find the nine total frequencies
$f_{ij}$ when the nine partial frequencies $\phi_{ij}$ are known.


* Cf. Biometrika, Vol. xvi. p. 312, or Tables for Statisticians and Biometricians, Part II, p. xiii. 
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We have immediately the following integrals : 


+}a 3 -+ta 3 
2 e a : . es a 
[.0-*)de=0-5, | 2-9 er=-, 
r1+}a ° 1+}a 3 rl+}a 3 
ee Se _ 72) ee es / — ball 
oe 1—w)dx= 12? fw? a?) dx 12” Jaq 2 tee 2a +75: 


from which we deduce at once Table I leading to the expressions of the $\phi$'s as functions of the
ordinates $z$. From this table, we can write the equations
$$\frac{576}{\alpha\beta}\,\phi_{00} = 4(12-\alpha^2)(12-\beta^2)\,z_{00} + \alpha^2\beta^2 z_{11} + \ldots, \quad \ldots$$

The resolution of this system of nine linear equations leads to Table II, from which we can write
the equations
$$576\,\alpha\beta\, z_{00} = 4(12+\alpha^2)(12+\beta^2)\,\phi_{00} + \alpha^2\beta^2\phi_{11} + \ldots, \quad \ldots$$
At last, eliminating the ordinates $z_{ij}$ between these nine equations and Pearson's formulae
(Biometrika, Vol. xvi. p. 312, or Tables for Statisticians and Biometricians, Part II, p. xiv,
formulae (a) to (c)), we obtain the formulae defined by Table III. They would read
$$576\,\alpha\beta\, f_{00} = 4(11+\alpha^2)(11+\beta^2)\,\phi_{00} + (1-\alpha^2)(1-\beta^2)\,\phi_{11} + \ldots, \quad \ldots$$

It should be noticed that we obtain Pearson's formulae (loc. cit.) by supposing that, in Table I,
$\alpha = \beta = 1$, or that, in Table III, $\alpha$ and $\beta$ tend to zero, and that $\phi_{ij}/\alpha\beta$ tends to $z_{ij}$. Similarly, Table II
should lead to the other set of Pearson's formulae (Biometrika, Vol. xvii. p. 313, or Tables for
Statisticians and Biometricians, Part II, p. xiv, formulae (q') to (v')), by putting $\alpha = \beta = 1$. As this
is not so, these formulae should be replaced by the following*:

5762 = =676footfutA-it/_-utf-1-1—26 (fy, +fo-1+f0 +f -10), 

57621, = 4 fy +529F,, +f_1-1- 23 (fp- +f -11) +46 ( for thio) — 2 (fo-1 +F-10)s 

5764-1 =4f+5294,-, +f —23 (fir tf-1-1) +46 (Jo -1 + fio) — 2 (fr +F-10), 

57623) =4fyt+529f_ 1. 4+/1-1— 23 (Ar t+f-1-1) +46 (fa t+f_w) -2 (fo-1+Aio)s 

5762 _3~1=4 foo + 529f_ 1-1 +f — 23 (Ar_-1 + f_11) + 46 (fo-1 +F-10) — 2 (Sar + fio) 
5762) = 52fiy + f_1-1 + fi-1— 23 (fur +f) + 59871 — 2670-1 — 2 ( fro #F-10); 
5762 _1 52foo thi tf—_u — 23 (fi-1+f_1-1) +598f/ -1 — 26fn — 2 (fo +F-10); 
576219 -52fyy +f_u+f—-1~-1— 23 (fir +fi-1) +598%0 — 267_ 10 — 2 (far +fo-1), 

= 52 


5762z_19 =52foothyt+fi-1— 23 (f_u+f-1-1) + 598/_ 10 -- 26/0 — 2 (Sin + fo -1)- 
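The coefficients in these nine equations can be checked by exact rational arithmetic. The sketch below is an illustration under stated assumptions, not the author's own computation: it takes the interpolation surface to be the product of two quadratic three-point (Lagrange) formulae through the ordinates at $-1, 0, 1$, integrates each weight over a strip of given width, and inverts the resulting $3 \times 3$ system; the two-dimensional coefficients are then products of the one-dimensional ones. All function names are hypothetical.

```python
from fractions import Fraction as F


def strip_integrals(centre, width):
    """Integrals over [centre - width/2, centre + width/2] of the quadratic
    Lagrange weights attached to the ordinates z_{-1}, z_0, z_1."""
    lo, hi = centre - width / 2, centre + width / 2
    cube, square, line = hi**3 - lo**3, hi**2 - lo**2, hi - lo
    return [cube / 6 - square / 4,   # weight x(x-1)/2 of z_{-1}
            line - cube / 3,         # weight 1 - x^2  of z_0
            cube / 6 + square / 4]   # weight x(x+1)/2 of z_1


def cell_matrix(width):
    """Rows phi_{-1}, phi_0, phi_1 expressed in (z_{-1}, z_0, z_1)."""
    return [strip_integrals(F(c), F(width)) for c in (-1, 0, 1)]


def inverse(m):
    """Gauss-Jordan inverse of a small matrix of Fractions."""
    n = len(m)
    aug = [list(row) + [F(int(i == j)) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def coeff(one_d, i, j, p, q):
    """Coefficient of f_{pq} in z_{ij} for the product (two-way) formula."""
    return one_d[i + 1][p + 1] * one_d[j + 1][q + 1]


if __name__ == "__main__":
    z_from_f = inverse(cell_matrix(1))
    print([24 * v for v in z_from_f[1]])      # z_0 in terms of f_{-1}, f_0, f_1
    print(576 * coeff(z_from_f, 0, 1, 0, 1))  # coefficient of f_01 in 576 z_01
```

The same matrices give the Table III relation for a partial cell of width $\alpha$: `matmul(cell_matrix(1), inverse(cell_matrix(alpha)))` expresses the $f$'s in terms of the $\phi$'s in one dimension.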


Lastly, if the basic cells are rectangles with sides $h$, $k$, and if the partial cells are rectangles
with sides $h'$, $k'$, concentric and coaxial to the rectangles of the basic cells, the right-hand sides
of the equations deduced from Table I should be multiplied by $hk$, and $\alpha$ and $\beta$ defined by the
relations $h' = \alpha h$, $k' = \beta k$, the right-hand sides of the equations deduced from Table II should be
divided by $hk$, and the right-hand sides of the equations deduced from Table III should remain
unchanged.


* [I am extremely obliged to Dr Chapelin for his correction of my formulae. K. P.]
On the Betas of Quadrilateral Distributions.
By OWEN L. DAVIES.


The Pearson Type curves are generated by a differential equation of the form
$$\frac{1}{y}\frac{dy}{dx} = \frac{x+a}{b_0+b_1x+b_2x^2} \qquad (1).$$

These curves cover the whole range of distributions which are likely to be encountered in the
field of practical statistics. There are, however, several possible distributions not included in (1)
which, although rare in actual experience, have more than mere mathematical interest. Such,
for example, is the trapezium or triangle and in fact all curves for which $\frac{dy}{dx}$ is not continuous
at all points. The most interesting among these are the trapezia and quadrilaterals with one
point of discontinuity for $\frac{dy}{dx}$, namely, figures of the type

[Figure: outline of a quadrilateral frequency figure.]

Triangular, rectangular and linear distributions are all particular cases of these.


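I. Distributions following a Trapezium Law.

[Figure: trapezium DCEB standing on the base DB, with O, A and B marked on the base.]

OD = d, OA = a, OB = b, OC = c,

Range = DB = d + b.

The equations of the lines DC and EB are respectively
$$\eta = c\left(1+\frac{x}{d}\right), \qquad \eta = c\,\frac{b-x}{b-a}.$$

Let $M_s'$ be the $s$th moment of the whole figure about the vertical through O, then
$$M_s' = \int_a^b \eta x^s\,dx + \int_0^a c\,x^s\,dx + \int_{-d}^0 \eta x^s\,dx = \frac{c}{(s+1)(s+2)}\left[\frac{b^{s+2}-a^{s+2}}{b-a} + (-1)^s d^{s+1}\right].$$
Let
$$b = r\cos\theta, \quad a = r\sin\theta, \quad d = r\rho, \quad \lambda = \cos\theta\sin\theta,$$
then
$$M_s' = \frac{c\,r^{s+1}}{(s+1)(s+2)}\left[\frac{\cos^{s+2}\theta-\sin^{s+2}\theta}{\cos\theta-\sin\theta} + (-1)^s\rho^{s+1}\right],$$
and in particular $M_0' = \frac{cr}{2}\left[(\cos\theta+\sin\theta)+\rho\right] = N$, the total frequency.

Now $M_s' = N\mu_s'$, where $\mu_s'$ is the $s$th moment coefficient about O and, therefore, on simplification
the first four moment coefficients become
$$\mu_1' = \frac{r}{3}\,\frac{(1+\cos\theta\sin\theta)-\rho^2}{(\cos\theta+\sin\theta)+\rho},$$
$$\mu_2' = \frac{r^2}{6}\,\frac{(\cos\theta+\sin\theta)+\rho^3}{(\cos\theta+\sin\theta)+\rho},$$
$$\mu_3' = \frac{r^3}{10}\,\frac{(1+\cos\theta\sin\theta-\cos^2\theta\sin^2\theta)-\rho^4}{(\cos\theta+\sin\theta)+\rho},$$
$$\mu_4' = \frac{r^4}{15}\,\frac{(\cos\theta+\sin\theta)(1-\cos^2\theta\sin^2\theta)+\rho^5}{(\cos\theta+\sin\theta)+\rho}.$$

Referring these moments to the mean,
$$\mu_2 = \frac{r^2}{18(\rho+\epsilon)^2}\left[(1+2\lambda-2\lambda^2) + 3\rho\epsilon + 4\rho^2(1+\lambda) + 3\rho^3\epsilon + \rho^4\right],$$
$$\mu_3 = \frac{r^3}{270(\rho+\epsilon)^3}\left[(2+6\lambda-3\lambda^2-34\lambda^3) + 9\rho\epsilon(1+\lambda-6\lambda^2) + 3\rho^2(4-\lambda-29\lambda^2) - 45\rho^3\epsilon\lambda - \rho^4(12+39\lambda) - 9\rho^5\epsilon - 2\rho^6\right],$$
$$\mu_4 = \frac{2r^4}{270(\rho+\epsilon)^4}\left[(1+\lambda)^2(1+2\lambda-5\lambda^2) + 6\rho\epsilon(1+\lambda)(1+\lambda-3\lambda^2) + \rho^2(17+42\lambda-9\lambda^2-52\lambda^3) + 6\rho^3\epsilon(5+6\lambda-5\lambda^2) + 3\rho^4(12+24\lambda+\lambda^2) + 6\rho^5\epsilon(5+4\lambda) + \rho^6(17+26\lambda) + 6\rho^7\epsilon + \rho^8\right],$$
where $\lambda = \cos\theta\sin\theta$,
$$\epsilon = \cos\theta+\sin\theta = (1+2\lambda)^{\frac{1}{2}}.$$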
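The central moments of the trapezium can be checked against a direct integration of the figure. The following sketch integrates a piecewise-linear frequency figure exactly with rational numbers and compares the result with the closed expressions in $r$, $\lambda$, $\epsilon$ and $\rho$; it assumes those expressions are read with the coefficients $9, 3, -45, -(12+39\lambda), -9, -2$ in $\mu_3$ and $6, 1, 6, 3, 6, 1, 6, 1$ in $\mu_4$ as polynomials in $\rho$. The vertex list and function names are illustrative. The case $a = 3$, $b = 4$, $r = 5$ is chosen because $\cos\theta$, $\sin\theta$ and $\epsilon$ are then rational.

```python
from fractions import Fraction as F


def raw_moments(vertices, max_order=4):
    """Moment coefficients about x = 0 of the figure whose upper boundary joins
    the given (x, y) vertices by straight lines (normalised to unit area)."""
    pts = [(F(x), F(y)) for x, y in vertices]
    m = [F(0)] * (max_order + 1)
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x1 == x0:
            continue  # a vertical side contributes no area
        slope = (y1 - y0) / (x1 - x0)
        icpt = y0 - slope * x0
        for s in range(max_order + 1):
            m[s] += (icpt * (x1**(s + 1) - x0**(s + 1)) / (s + 1)
                     + slope * (x1**(s + 2) - x0**(s + 2)) / (s + 2))
    return [v / m[0] for v in m]


def central_moments(raw):
    m1, m2, m3, m4 = raw[1:5]
    return (m2 - m1**2,
            m3 - 3 * m1 * m2 + 2 * m1**3,
            m4 - 4 * m1 * m3 + 6 * m1**2 * m2 - 3 * m1**4)


def betas(vertices):
    mu2, mu3, mu4 = central_moments(raw_moments(vertices))
    return mu3**2 / mu2**3, mu4 / mu2**2


def trapezium_central_moments(r, lam, eps, rho):
    """Closed forms for mu_2, mu_3, mu_4 of the trapezium law."""
    den = rho + eps
    mu2 = r**2 / (18 * den**2) * ((1 + 2*lam - 2*lam**2) + 3*rho*eps
                                  + 4*rho**2*(1 + lam) + 3*rho**3*eps + rho**4)
    mu3 = r**3 / (270 * den**3) * ((2 + 6*lam - 3*lam**2 - 34*lam**3)
                                   + 9*rho*eps*(1 + lam - 6*lam**2)
                                   + 3*rho**2*(4 - lam - 29*lam**2)
                                   - 45*rho**3*eps*lam - rho**4*(12 + 39*lam)
                                   - 9*rho**5*eps - 2*rho**6)
    mu4 = 2 * r**4 / (270 * den**4) * ((1 + lam)**2*(1 + 2*lam - 5*lam**2)
                                       + 6*rho*eps*(1 + lam)*(1 + lam - 3*lam**2)
                                       + rho**2*(17 + 42*lam - 9*lam**2 - 52*lam**3)
                                       + 6*rho**3*eps*(5 + 6*lam - 5*lam**2)
                                       + 3*rho**4*(12 + 24*lam + lam**2)
                                       + 6*rho**5*eps*(5 + 4*lam)
                                       + rho**6*(17 + 26*lam) + 6*rho**7*eps + rho**8)
    return mu2, mu3, mu4


if __name__ == "__main__":
    # d = 1/2, a = 3, b = 4, c = 1: r = 5, lambda = 12/25, epsilon = 7/5, rho = 1/10
    figure = [(F(-1, 2), 0), (0, 1), (3, 1), (4, 0)]
    print(central_moments(raw_moments(figure)))
    print(trapezium_central_moments(5, F(12, 25), F(7, 5), F(1, 10)))
```

The helper `betas` also reproduces the limiting points used below: the right triangle gives $\beta_1 = 0.32$, $\beta_2 = 2.4$, and the rectangle gives $\beta_1 = 0$, $\beta_2 = 1.8$.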


The first two $\beta$'s, $\beta_1 = \dfrac{\mu_3^2}{\mu_2^3}$, $\beta_2 = \dfrac{\mu_4}{\mu_2^2}$,
of distributions following a trapezium law will thus depend on two parameters $\lambda = \cos\theta\sin\theta$
and $\rho$. Now $b > a$, $b$ and $a$ both positive; consequently, $\lambda$ is always positive and less than or
equal to $\frac{1}{2}$. Moreover, all distributions giving rise to different $\beta$'s are covered if we take $d \le b - a$, i.e.
$$0 \le \rho \le (\cos\theta-\sin\theta) \le 1.$$

By allowing $\lambda$ and $\rho$ to vary within the above limits, $\beta_1$ and $\beta_2$ will be seen to trace out an area
of finite extent on the $\beta_1$, $\beta_2$ plane. The limits to this area may be found by investigating the
$\beta_1$, $\beta_2$ lines of particular subcases.
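(i) $d = 0$, i.e. $\rho = 0$.

These distributions will correspond to figures of the following type:

[Figure: a rectangle on the base from O to a joined to a right triangle on the base from a to b.]

Their moments and $\beta$'s may be obtained by putting $\rho = 0$ in the general relations above.
These give
$$\mu_1' = \frac{r}{3}\,\frac{1+\lambda}{(1+2\lambda)^{\frac{1}{2}}},$$
$$\mu_2 = \frac{r^2}{18}\,\frac{1}{(1+2\lambda)}\,(1+2\lambda-2\lambda^2),$$
$$\mu_3 = \frac{r^3}{270}\,\frac{1}{(1+2\lambda)^{\frac{3}{2}}}\,(2+6\lambda-3\lambda^2-34\lambda^3),$$
$$\mu_4 = \frac{2r^4}{270}\,\frac{1}{(1+2\lambda)^2}\,(1+\lambda)^2(1+2\lambda-5\lambda^2),$$
whence
$$\beta_1 = \frac{8}{100}\,\frac{(2+6\lambda-3\lambda^2-34\lambda^3)^2}{(1+2\lambda-2\lambda^2)^3},$$
$$\beta_2 = \frac{24}{10}\,\frac{(1+\lambda)^2(1+2\lambda-5\lambda^2)}{(1+2\lambda-2\lambda^2)^2}, \qquad 0 \le \lambda \le \tfrac{1}{2}.$$
$\beta_1$, $\beta_2$ of such distributions will trace out a line of finite extent on the $\beta_1$, $\beta_2$ plane connecting
the line point L to the rectangular point R. These are, in fact, limiting cases corresponding
respectively to $\lambda = 0$ ($a = 0$) and $\lambda = \frac{1}{2}$ ($a = b$) (see Fig. 1).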


(ii) $a = 0$, i.e. $\lambda = 0$.

The distributions will now be triangular, corresponding to figures of the type

[Figure: triangle DCB on the base DB, with O on the base below C.]

Putting $\lambda = 0$ in the general relations, we have
$$\mu_2 = \frac{r^2}{18(\rho+1)^2}\left[1+3\rho+4\rho^2+3\rho^3+\rho^4\right],$$
$$\mu_3 = \frac{r^3}{270(\rho+1)^3}\left[2+9\rho+12\rho^2-12\rho^4-9\rho^5-2\rho^6\right],$$
$$\mu_4 = \frac{2r^4}{270(\rho+1)^4}\left[1+6\rho+17\rho^2+30\rho^3+36\rho^4+30\rho^5+17\rho^6+6\rho^7+\rho^8\right].$$

[Fig. 1. Diagram of the $\beta_1$, $\beta_2$ plane ($\beta_1$ from 0 to 0.35), showing the biquadratic, the limits for the $\beta$'s of the quadrilateral distributions and the $\beta_1$, $\beta_2$ line of the linear distributions.]

These may be simplified considerably by writing
$$r\rho = R\cos\phi, \quad r = R\sin\phi, \quad \gamma = \cos\phi\sin\phi,$$
whence
$$\mu_2 = \frac{R^2}{18}(1+\gamma), \qquad \mu_3 = \frac{R^3}{270}(1-2\gamma)^{\frac{1}{2}}(2+5\gamma),$$
giving
$$\beta_1 = \frac{2}{25}\,\frac{(1-2\gamma)(2+5\gamma)^2}{(1+\gamma)^3}, \qquad \beta_2 = \frac{12}{5}, \qquad 0 \le \gamma \le \tfrac{1}{2}.$$
$\beta_2$ has the same value for all triangular distributions, while $\beta_1$ varies between 0 and 0.32.

The $\beta_1$, $\beta_2$ line is, therefore, of finite extent, parallel to the $\beta_1$ axis and joins the line point to
a point I on the $\beta_2$ axis. I corresponds to the symmetrical case, i.e. an isosceles triangle.


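(iii) Symmetrical Case $(b - a) = d$, i.e. $\rho = (\cos\theta - \sin\theta)$.

Substituting $\rho = (\cos\theta - \sin\theta)$ in the general relations, we have
$$\mu_2 = \frac{r^2}{3(\rho+\epsilon)^2}\left[(1-\lambda-\lambda^2) + \rho\epsilon(1-\lambda)\right] = \frac{r^2}{24}\left[(3-4\lambda) + \rho\epsilon\right],$$
$$\mu_3 = 0,$$
$$\mu_4 = \frac{r^4}{15(\rho+\epsilon)^4}\left[(8-16\lambda-18\lambda^2+42\lambda^3-9\lambda^4) + 2\rho\epsilon(1-\lambda)(4-4\lambda-5\lambda^2)\right] = \frac{r^4}{480}\left[(19-44\lambda+18\lambda^2) + (13-20\lambda)\rho\epsilon\right],$$
whence $\beta_1 = 0$,
$$\beta_2 = \frac{6\left[(19-44\lambda+18\lambda^2) + (13-20\lambda)\,\epsilon\rho\right]}{10\left[(5-12\lambda+6\lambda^2) + (3-4\lambda)\,\epsilon\rho\right]},$$
$$\epsilon\rho = (\cos^2\theta-\sin^2\theta) = (1-4\lambda^2)^{\frac{1}{2}}, \qquad 0 \le \lambda \le \tfrac{1}{2}.$$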


— 
The substitution (1- 4y*)2 =2 i = oD : = 


Bo= 


will reduce 8, to the simple expression 
8,= 33 (17%). 
When 


A=0, then I 
and ' 


A=4$, then y=} 
The limits for y are, therefore, the same as those for A, namely (0, $) 


3). 


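$\beta_2$ thus varies between 1.8 and 2.4 and the $\beta_1$, $\beta_2$ line is that part of the $\beta_2$ axis lying between
the rectangular point R and the isosceles point I.

Differentiating $\beta_1$ and $\beta_2$ for the general case, we find
$$\left(\frac{\partial\beta_1}{\partial\rho}\right)_{\rho=\lambda=0} = 0, \qquad \left(\frac{\partial\beta_1}{\partial\lambda}\right)_{\rho=\lambda=0} = 0,$$
$$\left(\frac{\partial\beta_2}{\partial\rho}\right)_{\rho=\lambda=0} = 0, \qquad \left(\frac{\partial\beta_2}{\partial\lambda}\right)_{\rho=\lambda=0} = 0,$$
and $(\beta_2)_{\lambda=\rho=0} = 2.4$, $(\beta_1)_{\lambda=\rho=0} = 0.32$.

The $\beta_1$, $\beta_2$ area is, therefore, bounded below by the line $\beta_2 = 2.4$. Moreover, $\beta_1$ never exceeds
0.32 and $\beta_2$ is never less than 1.8. It is fairly evident, then, that the area traced out by $\beta_1$ and
$\beta_2$ of distributions following a trapezium law is bounded by the $\beta_1$, $\beta_2$ lines of the following three
subcases:

(i) triangle,

(ii) symmetrical trapezium,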
(iii) the quadrilateral formed by a rectangle and a right triangle (see Fig. 


_ 
~~ 
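II. Quadrilaterals with One Point of Discontinuity for dy/dx.

[Figure: the quadrilateral standing on the base OB, with O, A and B marked on the base.]

OA = a, OB = b, AC = c, ED = d = ρc.

The $s$th moment of the quadrilateral about O is given by
$$M_s' = \frac{c}{(s+1)(s+2)}\left[\frac{b^{s+2}-a^{s+2}}{b-a} + \rho\,a^{s+1}\right],$$
and $M_0' = \frac{c}{2}\left[(b+a)+a\rho\right]$.

Write
$$b = r\cos\theta, \quad a = r\sin\theta, \quad \lambda = \cos\theta\sin\theta,$$
then the first four moment coefficients about O become
$$\mu_1' = \frac{r}{3}\,\frac{(1+\lambda)+\rho\sin^2\theta}{(\cos\theta+\sin\theta)+\rho\sin\theta},$$
$$\mu_2' = \frac{r^2}{6}\,\frac{(\cos\theta+\sin\theta)+\rho\sin^3\theta}{(\cos\theta+\sin\theta)+\rho\sin\theta},$$
$$\mu_3' = \frac{r^3}{10}\,\frac{(1+\lambda-\lambda^2)+\rho\sin^4\theta}{(\cos\theta+\sin\theta)+\rho\sin\theta},$$
$$\mu_4' = \frac{r^4}{15}\,\frac{(\cos\theta+\sin\theta)(1-\lambda^2)+\rho\sin^5\theta}{(\cos\theta+\sin\theta)+\rho\sin\theta}.$$

Referring these moments to the centroid vertical,

(A)
$$\mu_2 = \frac{r^2}{18(\epsilon+\rho\sin\theta)^2}\left[(1+2\lambda-2\lambda^2) + \rho\{3\lambda(1-\lambda)+\sin^2\theta\,(2-\lambda)\} + \rho^2\sin^4\theta\right],$$
$$\mu_3 = \frac{r^3}{270(\epsilon+\rho\sin\theta)^3}\left[(2+6\lambda-3\lambda^2-34\lambda^3) + \rho\{9\lambda(1+3\lambda-7\lambda^2)+\sin^2\theta\,(6+3\lambda-39\lambda^2)\} + 3\rho^2\{2\sin^2\theta\,(1+3\lambda)(1-2\lambda)-\lambda^2(8\lambda-7)\} + 2\rho^3\sin^6\theta\right],$$
$$\mu_4 = \frac{2r^4}{270(\epsilon+\rho\sin\theta)^4}\left[(1+\lambda)^2(1+2\lambda-5\lambda^2) + \rho\{3\lambda(2+2\lambda-5\lambda^2-5\lambda^3)+\sin^2\theta\,(4+6\lambda-12\lambda^2-5\lambda^3)\} + 3\rho^2\{\lambda^2(1-5\lambda^2)+\sin^2\theta\,(2+4\lambda-7\lambda^2-4\lambda^3)\} + \rho^3\{\lambda^2(-4+5\lambda-3\lambda^2)+\sin^2\theta\,(4+4\lambda-10\lambda^2-7\lambda^3)\} + \rho^4\sin^8\theta\right].$$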
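Subcases.

(i) $d = 0$, i.e. $\rho = 0$.

The moments and $\beta$'s reduce to those already found for the figure (p. 500).

(ii) $d = -c$, i.e. $\rho = -1$.

This subcase corresponds to triangular distributions, and on substitution the three moments
reduce to
$$\mu_2 = \frac{r^2}{18}(1-\lambda), \qquad \mu_3 = \frac{r^3}{270}(1+2\lambda)^{\frac{1}{2}}(2-5\lambda), \qquad \mu_4 = \frac{2r^4}{270}(1-\lambda)^2,$$
giving
$$\beta_1 = \frac{2}{25}\,\frac{(1+2\lambda)(2-5\lambda)^2}{(1-\lambda)^3}, \qquad \beta_2 = \frac{12}{5}, \qquad 0 \le \lambda \le \tfrac{1}{2}.$$

This represents the same line already found for triangular distributions. The $\lambda$, however, has
a different meaning. The figure is symmetrical when $b = 2a$, i.e. $\cos\theta = 2\sin\theta$ or $\lambda = \frac{2}{5}$. For this
value of $\lambda$, $\beta_1 = 0$ as expected. For $\lambda = 0$ or $\frac{1}{2}$ the figure corresponds to a right triangle giving
$\beta_2 = 2.4$, $\beta_1 = 0.32$, the coordinates of the line point.

(iii) $a = b$, i.e. $\lambda = \frac{1}{2}$.

The distributions will now be linear and correspond to figures of the following type.

[Figure: a linear distribution.]

Substituting $\lambda = \frac{1}{2}$ in relations (A), we have
$$\mu_2 = \left(\frac{r}{\sqrt{2}}\right)^2 \frac{6+6\rho+\rho^2}{18(\rho+2)^2},$$
$$\mu_3 = \left(\frac{r}{\sqrt{2}}\right)^3 \frac{2\rho\,(9+9\rho+\rho^2)}{270(\rho+2)^3},$$
$$\mu_4 = \left(\frac{r}{\sqrt{2}}\right)^4 \frac{2\,(27+54\rho+39\rho^2+12\rho^3+\rho^4)}{270(\rho+2)^4},$$
giving
$$\beta_1 = \frac{32}{100}\,\frac{\rho^2(9+9\rho+\rho^2)^2}{(6+6\rho+\rho^2)^3},$$
$$\beta_2 = \frac{12}{5}\,\frac{27+54\rho+39\rho^2+12\rho^3+\rho^4}{(6+6\rho+\rho^2)^2}.$$
Put $(\rho+1) = \dfrac{\sin\theta}{\cos\theta}$ and $\lambda' = \cos\theta\sin\theta$, then
$$\beta_1 = \frac{32}{100}\,\frac{(1-2\lambda')(1+7\lambda')^2}{(1+4\lambda')^3},$$
$$\beta_2 = \frac{12}{5}\,\frac{(1+7\lambda')(1+\lambda')}{(1+4\lambda')^2}, \qquad 0 \le \lambda' \le \tfrac{1}{2}.$$

We may give the following interpretation to $\lambda'$. Let O'O = b, O'A = a and write
$$b = r\sin\theta, \quad a = r\cos\theta, \quad r^2 = a^2+b^2,$$
then $\lambda'$ will be equal to $\cos\theta\sin\theta$.

The curve* connecting $\beta_1$ and $\beta_2$ is
$$\left(1-\frac{25}{8}\beta_1\right) = \left(1-\frac{5}{12}\beta_2\right)\left\{3+2\sqrt{1-\frac{5}{12}\beta_2}\right\} \qquad (2).$$

It passes through the points
$$\beta_1 = 0.32,\ \beta_2 = 2.4 \quad \text{and} \quad \beta_1 = 0,\ \beta_2 = 1.8,$$
which are, respectively, the coordinates of the line point L and the rectangular point R.

Clearly, $\beta_2$ must be less than or equal to 2.4, and accordingly $\beta_1$ must be less than or equal
to 0.32.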
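The relation (2) can be confirmed numerically from the parametric equations of the linear distributions. The sketch below assumes the parametric forms $\beta_1 = \frac{32}{100}(1-2\lambda')(1+7\lambda')^2/(1+4\lambda')^3$ and $\beta_2 = \frac{12}{5}(1+7\lambda')(1+\lambda')/(1+4\lambda')^2$ and reads (2) as $1 - \frac{25}{8}\beta_1 = (1-\frac{5}{12}\beta_2)\{3 + 2\sqrt{1-\frac{5}{12}\beta_2}\}$; the function names are illustrative.

```python
import math


def linear_betas(lam):
    """beta_1, beta_2 of a linear distribution with parameter lambda'."""
    b1 = 0.32 * (1 - 2 * lam) * (1 + 7 * lam) ** 2 / (1 + 4 * lam) ** 3
    b2 = 2.4 * (1 + 7 * lam) * (1 + lam) / (1 + 4 * lam) ** 2
    return b1, b2


def curve_residual(b1, b2):
    """Left side minus right side of relation (2); zero on the curve."""
    q2 = max(0.0, 1 - 5 * b2 / 12)   # guard against rounding just below zero
    return (1 - 25 * b1 / 8) - q2 * (3 + 2 * math.sqrt(q2))


if __name__ == "__main__":
    for lam in (0.0, 0.1, 0.25, 0.4, 0.5):
        b1, b2 = linear_betas(lam)
        print(lam, b1, b2, curve_residual(b1, b2))
```

The end values $\lambda' = 0$ and $\lambda' = \frac{1}{2}$ return the line point and the rectangular point respectively.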

The line (2) may be readily plotted from the parametric equations by allowing $\lambda'$ to vary
between 0 and $\frac{1}{2}$. This line is of finite extent connecting the line point and the rectangular point,
and lies entirely within the biquadratic loop which is of fundamental importance in connection
with Pearson's Type curves. It is of interest, therefore, to determine how closely a Type I curve
$$y = y_0\left(1+\frac{x}{a_1}\right)^{m_1}\left(1-\frac{x}{a_2}\right)^{m_2} \qquad (3)$$
will fit a linear distribution. The best fit is obtained by identifying the first two moments and
the range in both cases.

For (3) we have†





? my’ 
a my +m,” 
x mM,’ Mrs’ mM,’ =m,+ 1) 
=, ? ng ; ; , * 
(my, +m, )2 (My + Meg +1) My =mM2+1) 
"2 my ? 
Hence os (my' + mo’ +1), 
ia Me 
l Me" 
; =(1+-4}), 
Mf m, 
i ae 
p an ee =, (1 — py’) wy,” ’ 
and, therefore, m=m+1l= a? Bhs dauvens act sdbesneneseseens snestaatan (4), 
’ L — py” ; . 
Me = Ng+1= ee rt) (m +2) Seuvencccoueseeseeeneecvedecsstcccee (5). 
\ A 


* Due to K. Pearson.
† K. Pearson, Phil. Trans., Vol. 186, p. 368.
Since $\dfrac{m_1}{m_2} = \dfrac{a_1}{a_2}$ and $a_1 + a_2$ = total range $R$, we may readily calculate the constants $m_1$, $m_2$,
$a_1$ and $a_2$. Moreover, the area of the whole curve is equal to the total frequency $N$. This
enables us to find the remaining constant $y_0$.

Example, $\lambda' = 0.25$.
We readily find $R = r(\cos\theta - \sin\theta) = r \times 0.7071068$,
$$\frac{\mu_1'}{R} = 0.5962251,$$
$$\frac{\mu_2}{R^2} = 0.07407407,$$
and the required Type I curve is found to be 
ee a/R \~ 1,500 a/R \ 31,606 
(4) 1°462,082 (14+ s5i035 1+) 366,005) 0 (6). 


How closely this curve fits the linear distribution may be judged from Fig. 2.

[FIGURE 2. The linear (quadrilateral) distribution for $\lambda' = 0.25$, plotted over the range 0 to 1.0, together with the fitted curves (6) and (7).]

Throughout the whole range of variation of $\lambda'$, the $\beta_1$, $\beta_2$ line remains fairly close to the lower
branch of the biquadratic. We may, therefore, reasonably expect a fairly good fit of the type
$$y = y_0\left(1+\frac{x}{a}\right)^{m} \quad \text{(Pearson Type IX curve)}.$$
Range $x = -a$ to 0.
Identifying the range and mean of the two distributions, we have
$$\frac{m+1}{m+2} = \frac{\mu_1'}{R},$$
from which $m$ may be calculated. $y_0$ is found by equating the whole area to the total frequency $N$.
For $\lambda' = 0.25$, the curve is found to be
$$y\,\frac{R}{N} = 1.4766273\left(1+\frac{x}{R}\right)^{0.4766273} \qquad (7),$$
which is plotted together with (6) and the line in Fig. 2. It can be seen that (7) is almost as
good a fit as the more general curve (6)*.
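The constants of the two fitted curves for $\lambda' = 0.25$ can be reproduced from the mean and variance of the linear distribution. The sketch below is an illustration under stated assumptions rather than the author's working: it takes the linear density to rise from relative height 1 at the start of the range to $\tan 75^\circ = 2 + \sqrt{3}$ at the end (so that $\cos\theta\sin\theta = \frac{1}{4}$), fits the Type I curve by equating mean, variance and range as for a Beta law with indices $m_1 + 1$, $m_2 + 1$, and fits the Type IX curve from $(m+1)/(m+2) = \mu_1'/R$. Function names are hypothetical.

```python
import math


def linear_mean_and_variance(end_over_start):
    """Mean (from the start) and variance, in units of the range R, of a density
    varying linearly from relative height 1 to `end_over_start`."""
    t = end_over_start
    mean = (1 + 2 * t) / (3 * (1 + t))
    second = (1 + 3 * t) / (6 * (1 + t))
    return mean, second - mean ** 2


def fit_type_I(mean, var):
    """Exponents m1, m2 and distances a1, a2 (units of R) of the Type I curve
    having the given mean, variance and unit range."""
    total = mean * (1 - mean) / var - 1      # m1 + m2 + 2
    m1 = mean * total - 1
    m2 = (1 - mean) * total - 1
    return m1, m2, m1 / (m1 + m2), m2 / (m1 + m2)


def fit_type_IX(mean):
    """Exponent m of y = y0 (1 + x/a)^m with the given mean; y0 R / N = m + 1."""
    return (2 * mean - 1) / (1 - mean)


if __name__ == "__main__":
    mean, var = linear_mean_and_variance(2 + math.sqrt(3))
    print(mean, var)            # about 0.596225 and 0.074074
    print(fit_type_I(mean, var))
    print(fit_type_IX(mean))
```

Under these assumptions the Type I exponents come out near 0.341506 and -0.091506 with $a_1/R$ near 1.366025, and the Type IX exponent near 0.476627, in agreement with the constants quoted in the text to about six decimal places.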


Let us return now to the general distributions of the type:

[Figure: the general quadrilateral on the base OB, with the point L marked on the ordinate through O.]

The first two $\beta$'s were found to depend on two parameters $\lambda$ and $\rho$. The $\beta$'s, therefore, trace
out an area on the $\beta_1$, $\beta_2$ plane.

(i) When the point D varies between O and C, i.e. $-1 \le \rho \le 0$ and $0 \le \lambda \le \frac{1}{2}$, the $\beta_1$, $\beta_2$ area is
identical with the one mapped out by the first two $\beta$'s of trapezia.

(ii) When D lies between C and L, i.e. $0 \le \rho \le \dfrac{\sin\theta}{\cos\theta-\sin\theta}$ and $0 \le \lambda \le \frac{1}{2}$, the area traced out
by the $\beta$'s is that part of the plane bounded by the $\beta_1$, $\beta_2$ lines of
(a) linear distributions,
(b) distributions represented by quadrilaterals formed by placing together a rectangle and
a right triangle.

(iii) When D lies beyond L, i.e. $\rho > \dfrac{\sin\theta}{\cos\theta-\sin\theta}$, the $\beta_1$, $\beta_2$ area is bounded by a loop which
is the envelope of the lines $\lambda$ = constant for $\frac{1}{2} > \lambda > \frac{2}{5}$, $\rho > \dfrac{\sin\theta}{\cos\theta-\sin\theta}$.
When $0 \le \lambda \le \frac{2}{5}$, the $\beta_1$, $\beta_2$ area is bounded by the loop $\lambda = \frac{2}{5}$, $\rho > 1$, i.e.
$$\beta_1 = \frac{8}{100}\,\frac{(218+510\rho+294\rho^2+2\rho^3)^2}{(37+26\rho+\rho^2)^3},$$
$$\beta_2 = \frac{12}{5}\,\frac{1225+1780\rho+894\rho^2+196\rho^3+\rho^4}{(37+26\rho+\rho^2)^2}.$$


III. Convex Curves.

The first two $\beta$'s of the above quadrilateral distributions occupy a relatively small portion
of the $\beta_1$, $\beta_2$ plane and those which are convex at all points fall within the area bounded by
the lines

(i) $\beta_1 = 0$,

(ii) $\beta_2 = 2.4$,

(iii) $\left(1-\dfrac{25}{8}\beta_1\right) = \left(1-\dfrac{5}{12}\beta_2\right)\left\{3+2\sqrt{1-\dfrac{5}{12}\beta_2}\right\}$.

Denote this region by C.

* Fig. 2 provides an interesting example of the extent to which equality in the first four moments 
leads to correspondence in form. 
Convex curves are necessarily of limited range and, therefore, of Pearson's curves, those
which are convex at all points must have the form

(iv) $y = y_0x^s(1-x)^t$.

Differentiating twice
$$\frac{d^2y}{dx^2} = y_0x^{s-2}(1-x)^{t-2}\left[s(s-1) - 2xs(s+t-1) + x^2(s+t)(s+t-1)\right].$$

This is negative throughout the whole range provided $s + t \le 1$, neither $s$ nor $t$ being negative.
This is, therefore, the condition for convexity.

When $s$ or $t$ is zero, the $\beta$'s of (iv) lie on the lower branch of the biquadratic, and when $s = t$
the curve is symmetrical, giving $\beta_1 = 0$. Furthermore, when $s + t = 1$, we have


&=5 (°-1) «= (2-1) (1+42). 
Eliminating 4 we have 
€ 
(v) $4\beta_2 - 5\beta_1 - 8 = 0$,

which is a straight line passing through the points ($\beta_1 = 0$, $\beta_2 = 2$) and L.

It is clear then that the region of convexity for the Pearson curves is bounded by
(a) the lower branch of the biquadratic,
(b) the $\beta_2$ axis,
(c) $4\beta_2 - 5\beta_1 - 8 = 0$.
This region lies entirely within C.
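The straight line (v) can be verified from the standard moment formulae of the curve (iv). The sketch below treats $y = y_0x^s(1-x)^t$ on the range 0 to 1 as a Beta law with indices $s+1$ and $t+1$ and uses the usual expressions for its skewness and kurtosis; the function name is illustrative.

```python
def beta_curve_betas(s, t):
    """beta_1 and beta_2 of y = y0 x^s (1 - x)^t on the range (0, 1)."""
    a, b = s + 1.0, t + 1.0
    n = a + b
    b1 = 4 * (b - a) ** 2 * (n + 1) / (a * b * (n + 2) ** 2)
    b2 = 3 * (n + 1) * (2 * n ** 2 + a * b * (n - 6)) / (a * b * (n + 2) * (n + 3))
    return b1, b2


if __name__ == "__main__":
    for s in (0.0, 0.2, 0.5, 0.8, 1.0):
        b1, b2 = beta_curve_betas(s, 1.0 - s)
        print(s, b1, b2, 4 * b2 - 5 * b1 - 8)   # last column should vanish
```

The symmetrical member $s = t = \frac{1}{2}$ gives $\beta_1 = 0$, $\beta_2 = 2$, and the member $s = 0$, $t = 1$ (a right triangle) gives the line point.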


(5) 
The normal curve y=ye *\%/ is convex between its points of inflexion. Let I/, denote the 
sth moment about the vertical through the origin and +ao the points of inflexion, then 


fag -() B F 
Ms,=2yy | @ So) 2% dx =yyo** Tye (8 +4). 
J 9 
Hence 


" , , ‘ : d*y E 1 
The points of inflexion are given by 5~,=0, ie. a= 55. Hence 


daz /2 


o ry (5) a My (5) 
which give $\beta_2 = 1.941$, $\beta_1 = 0$.
N+ ] 
2 
For one-half of the curve /=0° — 
| a 
from which we deduce $\beta_2 = 1.872$, $\beta_1 = 0.0273$.

Both these points lie within C.
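The values quoted for the portion of the normal curve between its points of inflexion can be reproduced with the error function. The sketch below assumes a normal curve of unit standard deviation cut off at its points of inflexion, one standard deviation either side of the mean, and uses the recurrence $\int x^s\phi\,dx = -x^{s-1}\phi + (s-1)\int x^{s-2}\phi\,dx$; the function name is illustrative.

```python
import math


def truncated_normal_betas(lo, hi):
    """beta_1, beta_2 of the unit normal curve restricted to [lo, hi]."""
    def phi(x):
        return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

    def Phi(x):
        return 0.5 * (1 + math.erf(x / math.sqrt(2)))

    def antiderivative(s, x):
        # Indefinite integral of x^s phi(x), by reduction on s
        if s == 0:
            return Phi(x)
        if s == 1:
            return -phi(x)
        return -x ** (s - 1) * phi(x) + (s - 1) * antiderivative(s - 2, x)

    raw = [antiderivative(s, hi) - antiderivative(s, lo) for s in range(5)]
    m1, m2, m3, m4 = (v / raw[0] for v in raw[1:])
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    return mu3 ** 2 / mu2 ** 3, mu4 / mu2 ** 2


if __name__ == "__main__":
    print(truncated_normal_betas(-1.0, 1.0))  # whole arc between the inflexions
    print(truncated_normal_betas(0.0, 1.0))   # one-half of the curve
```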
For the quadrant of the ellipse
$$y = y_0(1-x^2)^{\frac{1}{2}} \qquad (0 \le x \le 1),$$
$$M_s' \text{ (about the origin)} = y_0\int_0^1 (1-x^2)^{\frac{1}{2}}x^s\,dx$$


w(t? (FS) 


bo 
La | 

a 

n 

to| + 

P= 

ee 








r Ct *) 

Hence oe ee 
BT) ey 
+) ( +4 
2 
from which we deduce Bi, =1°9772 
B= maa 
The trigonometrical curve $y = y_0 \cos x$ $\left(-\dfrac{\pi}{2} \le x \le \dfrac{\pi}{2}\right)$ 
is convex. If $M_s'$ denotes the $s$th moment, about the origin, of one-half of the curve, 

$$M_s' = y_0 \int_0^{\pi/2} x^s \cos x\,dx = M_0'\,\mu_s',$$

$$M_0' = y_0.$$

After reduction we find $\beta_2 = 2.2317$, $\beta_1 = 0.1797$. 
For the complete symmetrical curve we find $\beta_2 = 2.1938$, $\beta_1 = 0$. 
Both these points fall in the region C. 
Finally, the curve $y = y_0 (1 - e^{-x})$ $(x > 0)$ 
is convex at all points. If we take the range (0, 1), we have 

$$M_s \text{ (about the origin)} = y_0 \int_0^1 (1-e^{-x})\, x^s\,dx = y_0\left(\frac{1}{e} - \frac{s}{s+1}\right) + sM_{s-1}.$$

Evaluating the integrals and referring the moments to the centroid vertical, we have 

$\mu_2 = 0.058830$, 
$\mu_3 = -0.006427$, 
$\mu_4 = 0.007716$, 

giving $\beta_2 = 2.2295$, 


which lies within C. 
Consequently, for all convex curves considered above, the first two $\beta$’s fall in the region C. 
I cannot conceive of a convex curve which does not give this result, and it seems quite probable 
that the $\beta$’s of all convex curves fall in the region C*. 





FIGURE 3. 
[* A direct proof that all convex frequency curves must lie in the area C would be of considerable 
interest. Ed.] 